Using the Linux sed Command
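PART1 Basics of the Linux sed Command
Getting to Know sed
Reference: Sed Tutorial - Tutorialspoint.
sed is a stream editor (a line-oriented editor): it processes its input one line at a time.
Strengths: it is good at processing text line by line and at modifying or deleting file content.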
SED can be used in many different ways, such as:
- Text substitution,
- Selective printing of text files,
- In-place editing of text files,
- Non-interactive editing of text files, and many more
The sed Workflow
The sed workflow is as follows (based on the diagram in Sed Tutorial - Tutorialspoint):
- Read: SED reads a line from the input stream (file, pipe, or stdin) and stores it in its internal buffer called pattern buffer.
- Execute: All SED commands are applied sequentially on the pattern buffer. By default, SED commands are applied on all lines (globally) unless line addressing is specified.
- Display: Send the (modified) contents to the output stream. After sending the data, the pattern buffer will be empty.
- The above process repeats until the file is exhausted.
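The cycle above can be sketched with a two-line input. This is a minimal illustration, assuming GNU sed; the input text and the variable name `out` are made up for the example.

```shell
# Each line is read into the pattern buffer, the s command runs on it,
# and the (modified) buffer is written to standard output.
out=$(printf 'hello\nworld\n' | sed 's/o/0/')
printf '%s\n' "$out"
# prints:
# hell0
# w0rld
```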
Points to Note:
- Pattern buffer is a private, in-memory, volatile storage area used by SED.
- By default, all SED commands are applied on the pattern buffer, hence the input file remains unchanged. GNU SED provides a way to modify the input file in place. We will explore it in later sections.
- There is another memory area called the hold buffer, which is also a private, in-memory, volatile storage area. Data can be stored in the hold buffer for later retrieval. At the end of each cycle, SED removes the contents of the pattern buffer, but the contents of the hold buffer persist between SED cycles. However, SED commands cannot be executed directly on the hold buffer, hence SED allows data movement between the hold buffer and the pattern buffer.
- Initially both pattern and hold buffers are empty.
- If no input files are provided, then SED accepts input from the standard input stream (stdin).
- If no address range is provided, then by default SED operates on each line.
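The last two points can be checked with a short sketch. It assumes input arrives on stdin through a pipe; the sample text and variable names are illustrative only.

```shell
input=$(printf 'foo\nfoo\nfoo\n')

# No address: the command runs on every line read from stdin.
all=$(printf '%s\n' "$input" | sed 's/foo/bar/')

# Address 2: the command runs on line 2 only.
one=$(printf '%s\n' "$input" | sed '2s/foo/bar/')

printf '%s\n---\n%s\n' "$all" "$one"
```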
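General Syntax of sed
sed [OPTION]... [SCRIPT] [input-file]...
[SCRIPT]: expressions can be separated by semicolons (;), and each expression consists of an address (the condition) plus a command. Examples: '1d' (delete line 1) and '1d;3d' (delete lines 1 and 3). In '1d', '1' is the address and 'd' is the command, meaning delete.
When quoting the whole SCRIPT, prefer single quotes ('). Inside double quotes (") the shell still interprets some characters: for example, in an interactive bash session the '!' in sed -n "/^/!p" triggers history expansion, whereas sed -n '/^/!p' reaches sed unchanged. Single quotes around [SCRIPT] are therefore recommended:
sed [OPTION]... '[SCRIPT]' [input-file]...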
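A small sketch of the quoting difference, using `$` rather than `!` because history expansion only applies to interactive shells. The variable names are illustrative.

```shell
# Single quotes: the shell leaves $p alone, so sed sees the address "$"
# (last line) followed by the command "p".
last=$(printf 'a\nb\nc\n' | sed -n '$p')
printf '%s\n' "$last"     # prints: c

# Double quotes: the shell expands variables before sed runs, which is
# useful only when that is what you intend.
n=2
second=$(printf 'a\nb\nc\n' | sed -n "${n}p")
printf '%s\n' "$second"   # prints: b
```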
What Is Pattern Space?
linux - The Concept of ‘Hold space’ and ‘Pattern space’ in sed - Stack Overflow:
When sed reads a file line by line, the line that has been currently read is inserted into the pattern buffer (pattern space). Pattern buffer is like the temporary buffer, the scratchpad where the current information is stored. When you tell sed to print, it prints the pattern buffer.
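Pattern space is also called the pattern buffer.
As the workflow shows, each line that sed reads from the input stream is placed in the pattern buffer, that is, the pattern space. The pattern space is emptied at the end of every cycle, so it normally holds only the line currently being processed, never the whole file.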
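One way to see that the pattern space holds a single line at a time is to wrap whatever it contains in brackets. This is a sketch with made-up input.

```shell
# .* matches the whole pattern space and & re-inserts the match,
# so each bracket pair shows one pattern-space load.
out=$(printf 'one\ntwo\nthree\n' | sed 's/.*/[&]/')
printf '%s\n' "$out"
# prints:
# [one]
# [two]
# [three]
```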
What Is Hold Space?
linux - The Concept of ‘Hold space’ and ‘Pattern space’ in sed - Stack Overflow:
Hold buffer / hold space is like a long-term storage, such that you can catch something, store it and reuse it later when sed is processing another line. You do not directly process the hold space, instead, you need to copy it or append to the pattern space if you want to do something with it. For example, the print command p prints the pattern space only. Likewise, s operates on the pattern space.
Hold space is also called the hold buffer. The key points: it can store data taken from the input, that data can be used later while sed is processing another line, and you cannot operate on it directly.
Unix Sed Tutorial : 7 Examples for Sed Hold and Pattern Buffer Operations:
As its name implies, sed hold buffer is used to save all or part of the sed pattern space for subsequent retrieval. The contents of the pattern space can be copied to the hold space, then back again. No operations are performed directly on the hold space. sed provides a set of hold and get functions to handle these movements.
…
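As a sketch of the hold and get functions mentioned above, the classic one-liner below reverses the order of lines. It is an illustration chosen here, not an example from the quoted sources.

```shell
# 1!G  on every line except the first, append the hold space to the pattern space
# h    copy the pattern space (the lines seen so far, newest first) to the hold space
# $p   on the last line, print the accumulated pattern space
out=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')
printf '%s\n' "$out"
# prints:
# 3
# 2
# 1
```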
PART2 Using the sed Command
Reference: Linux sed命令完全攻略(超级详细).
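Printing Pattern Space Contents with sed
Suppressing automatic printing of the pattern space ('-n')
Option '-n'
-n (--quiet, --silent) - suppress automatic printing of pattern space
'p' is used in the script ({script-only-if-no-other-script}):
p - Print the current pattern space.
As explained in the pattern space section above, every input line passes through the pattern space, and sed prints the pattern space automatically at the end of each cycle. Without the -n option, therefore, every line is printed to the terminal.
Contents of test.txt:
101:Java,ABC
'1p' prints the first line of the input file:
sed '1p' test.txt
The first line is printed by p, but the whole file is printed as well, so line 1 appears twice. Now add the '-n' option:
sed -n '1p' test.txt
So sed always prints the pattern space to the terminal after processing each line. To get only the lines we ask for, we use '-n' to suppress this behavior, which is why the p command is almost always used together with '-n'.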
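The difference can be reproduced with a throwaway file. Only the first line of the sample data comes from the text above; the other two lines are hypothetical stand-ins.

```shell
dir=$(mktemp -d)
# Hypothetical sample data (lines 2 and 3 are assumptions).
printf '101:Java,ABC\n102:PHP,DEF\n103:Python,GHI\n' > "$dir/test.txt"

without_n=$(sed '1p' "$dir/test.txt")     # line 1 twice, then the rest
with_n=$(sed -n '1p' "$dir/test.txt")     # line 1 only

printf '%s\n---\n%s\n' "$without_n" "$with_n"
rm -rf "$dir"
```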
Selecting Single and Multiple Lines
The general syntax of sed p is:
sed [OPTIONS...] '[ADDRESS]p' INPUT_FILE
- '<n>p' - print line n of the input
sed -n '1p' test.txt
- '<n>,<m>p' - print lines n through m of the input
sed -n '1,3p' test.txt
- '<n1>p;<n2>p;…;<ni>p' - print several individually chosen lines
sed -n '1p;3p' test.txt
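A quick sketch of the three address forms, plus `$` for the last line, on a hypothetical four-line file:

```shell
f=$(mktemp)
printf 'line1\nline2\nline3\nline4\n' > "$f"

range=$(sed -n '1,3p' "$f")     # lines 1 through 3
picked=$(sed -n '1p;3p' "$f")   # lines 1 and 3
tail=$(sed -n '3,$p' "$f")      # line 3 through the last line

printf '%s\n---\n%s\n---\n%s\n' "$range" "$picked" "$tail"
rm -f "$f"
```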
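Selecting Lines That Match a Given Text
- '/[match_text…]/p' - print the lines that match the given text ([match_text…]).
Find the lines that contain "Java"
- '/[match_text_1…]/p;/[match_text_2…]/p;…;/[match_text_i…]/p' - print the lines that match any of the given texts.
Find the lines that contain "Java" or "PHP"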
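The two searches described above might look like the following. The file contents are hypothetical apart from the first line.

```shell
f=$(mktemp)
printf '101:Java,ABC\n102:PHP,DEF\n103:Python,GHI\n' > "$f"

java=$(sed -n '/Java/p' "$f")
# Two expressions: a line matching both patterns would be printed twice.
either=$(sed -n '/Java/p;/PHP/p' "$f")

printf '%s\n---\n%s\n' "$java" "$either"
rm -f "$f"
```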
Adding Content with sed
i \
text - Insert text, which has each embedded newline preceded by a backslash.
a \
text - Append text, which has each embedded newline preceded by a backslash.
Adding Content Before or After Every Line
- 'i\[text]' - insert the content ([text]) before every line
sed 'i\100:C++,QQQ' test.txt
Output:
100:C++,QQQ
Likewise, content can be added after every line:
- 'a\[text]' - append the content ([text]) after every line
sed 'a\100:C++,QQQ' test.txt
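A sketch of both commands on a two-line input. The one-line `i\text` and `a\text` forms are assumed to be running under GNU sed, which accepts the text on the same line.

```shell
before=$(printf 'x\ny\n' | sed 'i\100:C++,QQQ')
after=$(printf 'x\ny\n' | sed 'a\100:C++,QQQ')

printf '%s\n---\n%s\n' "$before" "$after"
# prints:
# 100:C++,QQQ
# x
# 100:C++,QQQ
# y
# ---
# x
# 100:C++,QQQ
# y
# 100:C++,QQQ
```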
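Adding Content Before or After a Specific Line
Insert content before a given line:
- '<n>i\[text]' - insert the content ([text]) before line n
sed '1i\100:C++,QQQ' test.txt
Output:
100:C++,QQQ
Likewise, content can be inserted after a given line:
- '<n>a\[text]' - insert the content ([text]) after line n
sed '1a\100:C++,QQQ' test.txt
Note that all of these operations happen in sed's buffers and the result is never written back, so none of the sed commands above modify the original file.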
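The note about the original file can be verified directly. A minimal sketch, assuming GNU sed and a hypothetical two-line file:

```shell
f=$(mktemp)
printf '101:Java,ABC\n102:PHP,DEF\n' > "$f"

out=$(sed '1i\100:C++,QQQ' "$f")   # what sed writes to stdout
unchanged=$(cat "$f")              # what is still on disk

printf '%s\n---\n%s\n' "$out" "$unchanged"
rm -f "$f"
```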
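Adding Content Before or After Lines That Match a Given Text
- '/[match_text…]/i\[text]' - insert the content ([text]) before each line that matches the given text ([match_text…])
- '/[match_text…]/a\[text]' - append the content ([text]) after each line that matches the given text ([match_text…])
sed '/Java/i\222:C#,GGG' test.txt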
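Both forms side by side, as a sketch on hypothetical input (GNU sed assumed for the one-line syntax):

```shell
data=$(printf '101:Java,ABC\n102:PHP,DEF\n')

ins=$(printf '%s\n' "$data" | sed '/Java/i\222:C#,GGG')
app=$(printf '%s\n' "$data" | sed '/Java/a\222:C#,GGG')

printf '%s\n---\n%s\n' "$ins" "$app"
```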
Adding Content Interactively
Type sed '1i\ and press Enter; you can then type the content to add.
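Written out in a script, the same multi-line form looks like this. It is a sketch with made-up text: the backslash after `i` continues the command onto the next line, and a backslash before each embedded newline adds a further line of text.

```shell
out=$(printf 'a\nb\n' | sed '1i\
first added line\
second added line')
printf '%s\n' "$out"
# prints:
# first added line
# second added line
# a
# b
```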
Deleting Content with sed
Deleting a Single Line
- '<n>d' - delete line n
sed '1d' test.txt
Deleting Multiple Lines
- '<n>,<m>d' - delete lines n through m
sed '1,3d' test.txt
- '<n1>d;<n2>d;…;<ni>d' - delete several individually chosen lines
sed '1d;3d;5d' test.txt
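The three deletions can be compared on a six-line input. This is a sketch; the numbers stand in for real file content.

```shell
nums=$(printf '1\n2\n3\n4\n5\n6\n')

single=$(printf '%s\n' "$nums" | sed '1d')         # 2 3 4 5 6
range=$(printf '%s\n' "$nums" | sed '1,3d')        # 4 5 6
picked=$(printf '%s\n' "$nums" | sed '1d;3d;5d')   # 2 4 6

printf '%s\n---\n%s\n---\n%s\n' "$single" "$range" "$picked"
```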
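Deleting Lines That Match a Given Text
- '/[match_text…]/d' - delete the lines that match the text
Delete the lines that contain "Java"
- '/[match_text_1…]/,/[match_text_2…]/d' - delete every line from a match of text 1 through the next match of text 2, inclusive
Delete the lines from "Java" through "PHP"
- '/[match_text_1…]/d;/[match_text_2…]/d;…;/[match_text_i…]/d' - delete the lines that match any of the given texts
Delete the lines that contain "Java" or "PHP"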
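The three cases described above could be written as follows. The four-line sample is hypothetical, chosen so that one line sits between the "Java" and "PHP" lines.

```shell
data=$(printf '101:Java,ABC\n102:C,XYZ\n103:PHP,DEF\n104:Go,JKL\n')

no_java=$(printf '%s\n' "$data" | sed '/Java/d')
# A regex range is inclusive: both boundary lines are deleted too.
no_range=$(printf '%s\n' "$data" | sed '/Java/,/PHP/d')
no_either=$(printf '%s\n' "$data" | sed '/Java/d;/PHP/d')

printf '%s\n---\n%s\n---\n%s\n' "$no_java" "$no_range" "$no_either"
```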
An important application: deleting empty lines (see also: How to delete empty lines using sed command under Linux / UNIX - nixCraft)
cat test.txt
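A common way to do this is to delete lines matching `^$`. The sketch below uses made-up input and also shows a variant, added here as an assumption about what is usually wanted, that removes whitespace-only lines as well.

```shell
data=$(printf 'a\n\nb\n   \nc\n')

# ^$ matches lines with nothing between start and end.
no_empty=$(printf '%s\n' "$data" | sed '/^$/d')

# [[:space:]]* also catches lines made only of spaces or tabs.
no_blank=$(printf '%s\n' "$data" | sed '/^[[:space:]]*$/d')

printf '%s\n---\n%s\n' "$no_empty" "$no_blank"
```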
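Finding and Replacing with sed (Without Writing to the Original File)
Find and Replace
The general syntax is:
sed [OPTIONS other than '-i'...] 's/SEARCH_REGEX/REPLACEMENT/g' INPUTFILE
'g' is the global replacement flag. Without it, only the first match on each line is replaced.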
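A sketch of the effect of the g flag when the result goes to standard output only. The file content is hypothetical.

```shell
f=$(mktemp)
printf 'foo foo foo\n' > "$f"

first=$(sed 's/foo/linux/' "$f")     # first match on the line only
every=$(sed 's/foo/linux/g' "$f")    # every match on the line
on_disk=$(cat "$f")                  # the file itself is untouched

printf '%s\n%s\n%s\n' "$first" "$every" "$on_disk"
rm -f "$f"
```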
Replacing Every Line
The general syntax is:
sed [OPTIONS other than '-i'...] 'c\REPLACEMENT' INPUTFILE
Example:
sed 'c\Hello world!' test.txt
Replacing Specific Lines
- '<n>c\[REPLACEMENT]' - replace the content of line n with REPLACEMENT
sed '1c\Hello world!' test.txt
- '<n>,<m>c\[REPLACEMENT]' - replace lines n through m with REPLACEMENT (note: the whole range collapses into a single line)
sed '1,3c\Hello world!' test.txt
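The collapsing behavior of a range is easiest to see next to the single-line case. A sketch assuming GNU sed (for the one-line `c\text` form) and a hypothetical four-line input:

```shell
data=$(printf 'l1\nl2\nl3\nl4\n')

one=$(printf '%s\n' "$data" | sed '1c\Hello world!')
range=$(printf '%s\n' "$data" | sed '1,3c\Hello world!')

printf '%s\n---\n%s\n' "$one" "$range"
# prints:
# Hello world!
# l2
# l3
# l4
# ---
# Hello world!
# l4
```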
Finding and Replacing Strings in Files with sed
Reference: How to Use sed to Find and Replace String in Files | Linuxize
Syntax of the Substitute Command
The general form of searching and replacing text using sed takes the following form:
sed -i 's/SEARCH_REGEX/REPLACEMENT/g' INPUTFILE
- -i - By default, sed writes its output to the standard output. This option tells sed to edit files in place. If an extension is supplied (ex -i.bak), a backup of the original file is created.
- s - The substitute command, probably the most used command in sed.
- / / / - Delimiter character. It can be any character but usually the slash (/) character is used.
- SEARCH_REGEX - Normal string or a regular expression to search for.
- REPLACEMENT - The replacement string.
- g - Global replacement flag. By default, sed reads the file line by line and changes only the first occurrence of the SEARCH_REGEX on a line. When the replacement flag is provided, all occurrences are replaced.
- INPUTFILE - The name of the file on which you want to run the command.
It is a good practice to put quotes around the argument so the shell meta-characters won't expand.
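Suppose a file with the following content:
file.txt
123 Foo foo foo
Choosing Whether to Replace Globally
Without the g flag, only the first matching string on each line is replaced:
sed -i 's/foo/linux/' file.txt
file.txt now contains:
123 Foo linux foo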
Creating a Backup When Replacing
If a suffix (such as .bak) is appended to -i, a backup of the original file is created before it is modified:
sed -i.bak 's/foo/linux/' file.txt
[root@localhost ~]# ls -l file*
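A sketch of in-place editing with a backup, run in a temporary directory so nothing real is modified. The one-line file content is an assumption based on the outputs shown in this section.

```shell
dir=$(mktemp -d)
printf '123 Foo foo foo\n' > "$dir/file.txt"

sed -i.bak 's/foo/linux/' "$dir/file.txt"

edited=$(cat "$dir/file.txt")       # modified in place
backup=$(cat "$dir/file.txt.bak")   # original content preserved

printf '%s\n%s\n' "$edited" "$backup"
rm -rf "$dir"
```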
With the g flag, every matching string on each line is replaced:
sed -i 's/foo/linux/g' file.txt
Output:
123 Foo linux linux
Because the pattern also matches inside longer words, "foobar" is turned into "linuxbar" as well. To prevent this, put the word-boundary expression (\b) on both sides of SEARCH_REGEX so that only whole words match:
sed -i 's/\bfoo\b/linux/g' file.txt
Output:
123 Foo linux linux
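The effect of `\b` can be isolated on a single line. A sketch assuming GNU sed, where `\b` is supported as a word-boundary extension:

```shell
loose=$(printf 'foo foobar\n' | sed 's/foo/linux/g')
strict=$(printf 'foo foobar\n' | sed 's/\bfoo\b/linux/g')

printf '%s\n%s\n' "$loose" "$strict"
# prints:
# linux linuxbar
# linux foobar
```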
To make the match case-insensitive, add the I flag. The following example uses both the g and I flags:
sed -i 's/foo/linux/gI' file.txt
Output:
123 linux linux linux
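A sketch comparing the same substitution with and without I, assuming GNU sed (the I flag is a GNU extension):

```shell
sensitive=$(printf '123 Foo foo foo\n' | sed 's/foo/linux/g')
insensitive=$(printf '123 Foo foo foo\n' | sed 's/foo/linux/gI')

printf '%s\n%s\n' "$sensitive" "$insensitive"
# prints:
# 123 Foo linux linux
# 123 linux linux linux
```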
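Matching Characters That Need Escaping
If the pattern contains a character that must be escaped (such as the delimiter /), put a backslash (\) in front of it. The following example replaces "/bin/bash" with "/usr/bin/zsh":
sed -i 's/\/bin\/bash/\/usr\/bin\/zsh/g' file.txt
The delimiter does not have to be '/'. Another character, such as '#' or '|', makes the search and replacement strings easier to read, because the slashes no longer need escaping:
sed -i 's|/bin/bash|/usr/bin/zsh|g' file.txt
Output:
123 Foo linux foo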
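Both spellings give the same result, as the sketch below shows on a made-up input line:

```shell
line='foo /bin/bash Ubuntu'

escaped=$(printf '%s\n' "$line" | sed 's/\/bin\/bash/\/usr\/bin\/zsh/g')
piped=$(printf '%s\n' "$line" | sed 's|/bin/bash|/usr/bin/zsh|g')

printf '%s\n%s\n' "$escaped" "$piped"
# both print: foo /usr/bin/zsh Ubuntu
```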
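Using Regular Expressions
SEARCH_REGEX can be a regular expression. For example, replace three-digit numbers such as "456" and "123" with "number":
sed -i 's/\b[0-9]\{3\}\b/number/g' file.txt
Output:
number Foo foo foo
In REPLACEMENT, & stands for the text matched by SEARCH_REGEX:
sed -i 's/\b[0-9]\{3\}\b/{&}/g' file.txt
Output:
{123} Foo foo foo
In this example '&' stands for the matched three-digit number (the regular expression is "\b[0-9]\{3\}\b").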
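A sketch of `&` on a line with two three-digit numbers, assuming GNU sed for `\b`. The input is made up.

```shell
out=$(printf '123 Foo foo 456\n' | sed 's/\b[0-9]\{3\}\b/{&}/g')
printf '%s\n' "$out"
# prints: {123} Foo foo {456}
```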
With the '-r' option (equivalently '-E'), sed accepts extended regular expressions (ERE):
sed -r [OTHER_OPTIONS...] '[SCRIPT...]' [INPUT_FILE]
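The practical difference is mostly in escaping: in basic syntax the interval is written `\{3\}`, while in extended syntax it is `{3}`. A sketch assuming GNU sed, using -E and a made-up input line:

```shell
line='abc 123 4567'

bre=$(printf '%s\n' "$line" | sed 's/\b[0-9]\{3\}\b/number/g')
ere=$(printf '%s\n' "$line" | sed -E 's/\b[0-9]{3}\b/number/g')

printf '%s\n%s\n' "$bre" "$ere"
# both print: abc number 4567
```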
Examples
Example: Extracting the Host's IP Address
Extract the 32-bit (IPv4) address:
ip a s ens33
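The rest of this example is not shown above, so the following is only one possible approach, not the original command. It runs on a hard-coded line shaped like the "inet" line that `ip a s` prints, with a hypothetical address; on a real host the `ip a s ens33` output would be piped in instead.

```shell
# Hypothetical sample of the IPv4 line from `ip a s ens33`.
sample='    inet 192.168.1.10/24 brd 192.168.1.255 scope global ens33'

# -n with the p flag prints only lines where the substitution succeeded;
# \1 keeps the address captured between "inet " and the "/prefix" part.
addr=$(printf '%s\n' "$sample" | sed -nE 's/^ *inet ([0-9.]+)\/.*/\1/p')
printf '%s\n' "$addr"   # prints: 192.168.1.10
```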
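Example: Renaming File Extensions in Bulk
Rename "violet01.txt" through "violet10.txt" to "violet01.jpg" through "violet10.jpg":
touch violet{01..10}.txt
Another approach
'&' stands for the text before replacement (the matched text). This usage resembles the '{}' in xargs -i cp {} ~\ and works somewhat like a back-reference to the earlier match.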
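One way to use `&` for the rename is to let sed turn each file name into an `mv` command and hand the result to a shell. This is a sketch under stated assumptions, not necessarily the original author's command: it works in a temporary directory with three files and assumes names without spaces or shell metacharacters.

```shell
dir=$(mktemp -d)
touch "$dir/violet01.txt" "$dir/violet02.txt" "$dir/violet03.txt"

(
  cd "$dir" || exit 1
  # & is the whole matched name, \1 is the name without ".txt",
  # so each line becomes: mv violetNN.txt violetNN.jpg
  printf '%s\n' violet*.txt | sed -E 's/(.*)\.txt$/mv & \1.jpg/' | sh
)

renamed=$(ls "$dir")
printf '%s\n' "$renamed"
rm -rf "$dir"
```

Dropping the final `| sh` prints the generated commands instead of running them, which is a convenient dry run.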
For renaming, the rename command can also be used.
Rename syntax:
rename [options] expression replacement file...
expression - the part of the original file name to change
replacement - the text to put in its place
file… - the files to rename
rename .txt .jpg violet*.txt