Unix - Text Processing with sed in UNIX
Introduction
sed (Stream Editor) is a powerful command-line utility in UNIX and Linux systems used for parsing, filtering, and transforming text. Unlike interactive text editors such as vi or nano, sed processes text non-interactively. It reads input line by line, performs specified operations, and outputs the modified text.
The name "Stream Editor" comes from its ability to process a continuous stream of text. It is widely used in shell scripting, system administration, log file analysis, data transformation, and automation tasks.
Why Use sed?
The sed command is useful because it:
- Automates repetitive text editing tasks.
- Processes large files efficiently.
- Works directly from the command line.
- Can be combined with other UNIX commands.
- Supports pattern matching using regular expressions.
- Eliminates the need to manually edit files.
Basic Syntax
sed 'command' filename
Example:
sed 's/apple/orange/' fruits.txt
This command replaces the first occurrence of "apple" with "orange" in each line and displays the result on the screen.
How sed Works
The operation of sed follows these steps:
- Reads one line from input.
- Stores it in a temporary buffer called the pattern space.
- Applies specified commands.
- Outputs the modified line.
- Reads the next line and repeats the process.
This continues until the entire file has been processed.
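The cycle above can be observed directly. The following is a minimal sketch using a two-line sample input made up for illustration. The p command prints the pattern space explicitly, and sed prints it once more at the end of each cycle unless the -n option suppresses that automatic output.

```shell
# Sample input invented for this illustration.
input=$(printf 'first\nsecond')

# Without -n: 'p' prints the pattern space, then sed prints it again
# automatically at the end of the cycle, so every line appears twice.
doubled=$(printf '%s\n' "$input" | sed 'p')

# With -n: the automatic print is suppressed, so only the explicit 'p' remains.
single=$(printf '%s\n' "$input" | sed -n 'p')

printf '%s\n' "$doubled"
printf '%s\n' "$single"
```

The first printf shows each line twice; the second shows each line once.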
The Substitution Command
The most commonly used sed operation is substitution.
Syntax:
s/search_pattern/replacement/
Example:
sed 's/cat/dog/' pets.txt
Input:
The cat is sleeping.
The cat likes milk.
Output:
The dog is sleeping.
The dog likes milk.
Only the first occurrence in each line is replaced.
Replacing All Occurrences
To replace every occurrence within a line, use the g flag.
sed 's/cat/dog/g' pets.txt
Input:
cat cat cat
Output:
dog dog dog
The g stands for global replacement.
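To see the difference the g flag makes, the same line can be run through both forms of the command. This is a small sketch; the variable names are chosen here for illustration only.

```shell
line='cat cat cat'   # same sample text as the input above

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/cat/dog/')

# With g: every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/cat/dog/g')

printf '%s\n' "$first_only"   # dog cat cat
printf '%s\n' "$all"          # dog dog dog
```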
Replacing Text in a Specific Line
You can target a particular line number.
sed '3s/error/success/' file.txt
This replaces the first occurrence of "error" with "success" on line 3 only. All other lines pass through unchanged.
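A line number in front of the command is called an address. The sketch below, using a three-line sample made up for illustration, shows a single-line address and a range address applied to the same substitution.

```shell
# Three-line sample content, invented for this illustration.
sample=$(printf 'error one\nerror two\nerror three')

# Substitute only on line 3.
line3=$(printf '%s\n' "$sample" | sed '3s/error/success/')

# A range address works the same way: lines 2 through 3.
range=$(printf '%s\n' "$sample" | sed '2,3s/error/success/')

printf '%s\n' "$line3"
printf '%s\n' "$range"
```

In the first result only the third line changes; in the second, lines 2 and 3 change and line 1 is left alone.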
Printing Specific Lines
By default, sed prints every processed line.
To print only specific lines, suppress the default output with -n and print explicitly with the p command:
sed -n '5p' file.txt
This displays only line 5.
Example:
sed -n '1,10p' file.txt
Prints lines 1 through 10.
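The same approach works on any input. Here is a short sketch on a four-line sample (made up for illustration) that prints a range and then the last line, which sed addresses as $.

```shell
# Four-line sample content, invented for this illustration.
sample=$(printf 'one\ntwo\nthree\nfour')

# Print only lines 2 through 3.
middle=$(printf '%s\n' "$sample" | sed -n '2,3p')

# '$' addresses the last line of input.
last=$(printf '%s\n' "$sample" | sed -n '$p')

printf '%s\n' "$middle"
printf '%s\n' "$last"     # four
```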
Deleting Lines
Delete a Specific Line
sed '3d' file.txt
Deletes line 3.
Delete a Range of Lines
sed '5,10d' file.txt
Deletes lines 5 through 10.
Delete Blank Lines
sed '/^$/d' file.txt
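Removes empty lines from the output. Lines that contain only spaces or tabs do not match ^$ and are kept, and the file itself is not modified.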
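Lines that look blank often contain spaces or tabs. The sketch below compares the pattern above with a variant that uses the POSIX character class [[:space:]]; the sample input is made up for illustration.

```shell
# Sample input: line 2 is empty, line 4 holds two spaces.
sample=$(printf 'alpha\n\nbeta\n  \ngamma')

# /^$/ matches only truly empty lines, so the two-space line survives.
empty_removed=$(printf '%s\n' "$sample" | sed '/^$/d')

# /^[[:space:]]*$/ also matches lines made up of whitespace only.
blank_removed=$(printf '%s\n' "$sample" | sed '/^[[:space:]]*$/d')

printf '%s\n' "$empty_removed"
printf '%s\n' "$blank_removed"
```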
Inserting Text
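The i command inserts text before a line. The one-line form below is accepted by GNU sed; POSIX sed expects the text on the line after the backslash.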
sed '3i\New Line Added' file.txt
Output:
Line 1
Line 2
New Line Added
Line 3
Appending Text
To add text after a line:
sed '3a\Additional Information' file.txt
Output:
Line 1
Line 2
Line 3
Additional Information
Line 4
Changing Entire Lines
The c command replaces an entire line.
sed '2c\This line has been replaced' file.txt
Output:
Line 1
This line has been replaced
Line 3
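The i, a, and c commands can also be written with the text on the line following the backslash, which is the form described by POSIX and should work in both GNU and BSD sed. This is a sketch on a three-line sample; the inserted and appended texts are made up for illustration.

```shell
# Three-line sample content matching the outputs shown above.
sample=$(printf 'Line 1\nLine 2\nLine 3')

# i: insert text before line 2.
inserted=$(printf '%s\n' "$sample" | sed '2i\
Inserted before line 2')

# a: append text after line 2.
appended=$(printf '%s\n' "$sample" | sed '2a\
Appended after line 2')

# c: replace line 2 entirely.
changed=$(printf '%s\n' "$sample" | sed '2c\
This line has been replaced')

printf '%s\n' "$inserted"
printf '%s\n' "$appended"
printf '%s\n' "$changed"
```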
Using Regular Expressions
Patterns in sed are regular expressions, so a single command can match a whole class of characters rather than one fixed string.
Replace Any Digit
sed 's/[0-9]/X/g' file.txt
Input:
Room 123
Output:
Room XXX
Replace Alphabetic Characters
sed 's/[A-Za-z]/#/g' file.txt
Input:
Hello123
Output:
#####123
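A bracket expression matches one character at a time, which is why each digit above becomes its own X. To treat a run of digits as a single unit, the class can be repeated with *. A small sketch follows; the sample text and the placeholder N are made up for illustration.

```shell
sample='Room 123, Floor 4'   # sample text invented for this illustration

# One replacement per digit.
each_digit=$(printf '%s\n' "$sample" | sed 's/[0-9]/X/g')

# [0-9][0-9]* matches one digit followed by any further digits,
# so each whole number is replaced once.
each_number=$(printf '%s\n' "$sample" | sed 's/[0-9][0-9]*/N/g')

printf '%s\n' "$each_digit"    # Room XXX, Floor X
printf '%s\n' "$each_number"   # Room N, Floor N
```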
Searching for Patterns
Print lines containing a specific word.
sed -n '/error/p' logfile.txt
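This displays all lines containing the string "error", including lines where it appears inside a longer word such as "errors". The match is case-sensitive.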
Deleting Lines Matching a Pattern
sed '/error/d' logfile.txt
Removes all lines containing "error".
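Printing and deleting by pattern are complementary: together they split the input into matching and non-matching lines. The sketch below uses a four-line sample log made up for illustration.

```shell
# Sample log content, invented for this illustration.
log=$(printf 'ok: started\nerror: disk full\nok: retry\nerrors: 2')

# Keep only the lines that contain "error" (this includes "errors").
matching=$(printf '%s\n' "$log" | sed -n '/error/p')

# Keep everything except those lines.
remaining=$(printf '%s\n' "$log" | sed '/error/d')

printf '%s\n' "$matching"
printf '%s\n' "$remaining"
```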
Replacing Text Between Delimiters
Sometimes file paths contain slashes, making substitutions difficult.
Instead of /, another delimiter can be used.
sed 's|/home/user|/backup/user|g' file.txt
This improves readability when dealing with paths.
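For comparison, here is the same substitution written both ways. The sketch assumes a sample path made up for illustration; with / as the delimiter, every slash in the pattern and replacement has to be escaped.

```shell
path='/home/user/docs/report.txt'   # sample path invented for this illustration

# Default delimiter: each slash inside the expression needs a backslash.
escaped=$(printf '%s\n' "$path" | sed 's/\/home\/user/\/backup\/user/g')

# Alternative delimiter: no escaping needed.
piped=$(printf '%s\n' "$path" | sed 's|/home/user|/backup/user|g')

printf '%s\n' "$escaped"   # /backup/user/docs/report.txt
printf '%s\n' "$piped"     # /backup/user/docs/report.txt
```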
Editing Files Directly
Normally, sed writes its result to standard output and leaves the original file unchanged.
To modify the original file:
sed -i 's/old/new/g' file.txt
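The -i option performs in-place editing. It is not part of POSIX: GNU sed accepts -i on its own, while BSD and macOS sed require a backup suffix argument after it (an empty one, -i '', for no backup).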
Example:
sed -i 's/admin/administrator/g' users.txt
The changes are saved directly to the file.
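Because in-place edits overwrite the file, it can be safer to attach a backup suffix to -i. The following sketch works in a temporary directory; the file contents are made up for illustration, and the -i.bak form (suffix attached directly to the option) is assumed to be supported, as it is in GNU and BSD sed.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf 'admin\nguest\n' > "$workdir/users.txt"

# -i.bak edits users.txt in place and keeps the original as users.txt.bak.
sed -i.bak 's/admin/administrator/g' "$workdir/users.txt"

edited=$(cat "$workdir/users.txt")
backup=$(cat "$workdir/users.txt.bak")

printf '%s\n' "$edited"   # edited file: administrator, guest
printf '%s\n' "$backup"   # backup copy: admin, guest

rm -r "$workdir"
```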
Multiple Commands
Several operations can be performed together.
sed -e 's/error/warning/g' -e 's/fail/pass/g' file.txt
This executes both substitutions sequentially.
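Each command is applied to the pattern space in the order given, so a later command sees the result of an earlier one. A brief sketch, with sample text made up for illustration; separating commands with a semicolon inside one expression is a common alternative to repeating -e.

```shell
sample='error: job fail'   # sample text invented for this illustration

# Two -e expressions, applied in order to every line.
with_e=$(printf '%s\n' "$sample" | sed -e 's/error/warning/g' -e 's/fail/pass/g')

# The same two commands separated by a semicolon.
with_semicolon=$(printf '%s\n' "$sample" | sed 's/error/warning/g; s/fail/pass/g')

# Order matters: the second command rewrites the output of the first.
chained=$(printf '%s\n' 'a' | sed -e 's/a/b/' -e 's/b/c/')

printf '%s\n' "$with_e"           # warning: job pass
printf '%s\n' "$with_semicolon"   # warning: job pass
printf '%s\n' "$chained"          # c
```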
Working with Log Files
System administrators often use sed for log analysis.
Example:
sed '/DEBUG/d' application.log
Prints the log without the lines that contain "DEBUG". The log file itself is unchanged.
Example:
sed 's/ERROR/CRITICAL/g' application.log
Relabels every "ERROR" as "CRITICAL" in the output.
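Both steps can be done in a single pass over the log. This is a sketch on a four-line sample log made up for illustration.

```shell
# Sample log content, invented for this illustration.
log=$(printf 'INFO start\nDEBUG cache miss\nERROR disk full\nDEBUG retry')

# Drop DEBUG lines, then relabel ERROR on the lines that remain.
cleaned=$(printf '%s\n' "$log" | sed -e '/DEBUG/d' -e 's/ERROR/CRITICAL/g')

printf '%s\n' "$cleaned"
```

The result contains only the INFO line and the relabelled CRITICAL line.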
Using sed in Shell Scripts
Example script:
#!/bin/bash
sed 's/January/Jan/g' report.txt
This script prints report.txt with every "January" abbreviated to "Jan". Redirect the output to a new file to save the result.
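Scripts often need the search and replacement text to come from variables. One way to sketch this is a small function that builds the sed expression in double quotes so the shell expands its arguments. The function name replace_word is hypothetical, and the sketch assumes the arguments contain no /, &, backslash, or regular-expression metacharacters.

```shell
# Hypothetical helper: replace every occurrence of $1 with $2 in text read
# from standard input. Assumes both arguments are plain words.
replace_word() {
    # Double quotes let the shell substitute $1 and $2 before sed runs.
    sed "s/$1/$2/g"
}

result=$(printf 'January sales\nJanuary costs\n' | replace_word January Jan)

printf '%s\n' "$result"
```

Arguments that may contain special characters would need escaping before being placed in the expression.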
Practical Applications of sed
Configuration File Updates
sed -i 's/port=8080/port=9090/' config.conf
Cleaning Data
sed '/^$/d' data.txt
Removing Comments
sed '/^#/d' script.sh
Formatting Reports
sed 's/Department/Division/g' report.txt
Log File Processing
sed '/INFO/d' server.log
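One caveat with the comment-removal example: /^#/d also deletes the #! line at the top of a script. A possible workaround, sketched below on a sample script made up for illustration, is to restrict the deletion to lines 2 through the end.

```shell
# Sample script content, invented for this illustration.
script=$(printf '#!/bin/sh\n# say hello\necho hello\n# done')

# Deletes every line starting with '#', including the #! line.
all_removed=$(printf '%s\n' "$script" | sed '/^#/d')

# Applies the deletion only from line 2 onward, so line 1 is kept.
shebang_kept=$(printf '%s\n' "$script" | sed '2,$ { /^#/d; }')

printf '%s\n' "$all_removed"
printf '%s\n' "$shebang_kept"
```

The first result is just the echo line; the second keeps the #! line as well.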
Advantages of sed
- Fast and lightweight.
- Handles large files efficiently.
- Supports automation.
- Integrates well with shell scripts.
- Provides powerful pattern matching.
- Reduces manual editing effort.
Limitations of sed
- Complex commands can be difficult to understand.
- Not suitable for highly interactive editing.
- Multi-line processing can become complicated.
- Error handling is limited compared to programming languages.
Best Practices
- Test commands before using the -i option.
- Keep backups of important files.
- Use meaningful regular expressions.
- Combine sed with grep, awk, and shell scripts for advanced processing.
- Document complex sed commands for future maintenance.
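One way to test a command before committing to -i is a dry run: run sed without -i and compare its output to the original file. The sketch below assumes the diff utility is available and uses a temporary config file made up for illustration.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf 'port=8080\nhost=localhost\n' > "$workdir/config.conf"

# Dry run: show what would change without modifying the file.
# diff exits non-zero when its inputs differ, hence the '|| true'.
preview=$(sed 's/port=8080/port=9090/' "$workdir/config.conf" \
    | diff "$workdir/config.conf" - || true)

printf '%s\n' "$preview"

# The file on disk is still in its original state.
unchanged=$(cat "$workdir/config.conf")

rm -r "$workdir"
```

If the preview shows only the intended change, the same expression can then be run with -i.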
Conclusion
sed is one of the most powerful text-processing utilities available in UNIX. It allows users to search, replace, insert, delete, and transform text efficiently using simple commands and regular expressions. Because of its speed, flexibility, and scripting capabilities, sed remains an essential tool for system administrators, developers, and data-processing professionals who work extensively with text files and command-line environments.