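Text Operations in the Shell

Article link: https://blog.youkuaiyun.com/qq_15956753/article/details/128524633

This article introduces the text-processing tools most often used on the Linux command line: grep for pattern searches, sed for substitution and stream editing, awk for processing text and data, plus read, find, sort, and cut. Several of these tools accept regular expressions, which makes it efficient to search, filter, replace, and sort file contents.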


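Text operations

Metacharacters

Main metacharacters

() # marks the start and end of a subexpression
{} # marks the start and end of a quantifier expression
. # any character except a newline
^ # start of the string
$ # end of the string
\d,\w,\s # a digit, a word character, a whitespace character
\D,\W,\S # a non-digit, a non-word character, a non-whitespace character
[abc] # one of the characters a, b, or c
[a-z] # one character in the range a to z
[^abc] # any one character other than a, b, or c
aa|bb # aa or bb
? # zero or one occurrence
* # zero or more occurrences
+ # one or more occurrences
{n} # exactly n occurrences
{n,} # n or more occurrences
{m,n} # at least m and at most n occurrences
(expr) # captures the subpattern expr; refer to it as \1
(?:expr) # groups expr without capturing it
(?=expr) # positive lookahead for expr
(?!expr) # negative lookahead for expr
# Note: \d, (?:...), (?=...), (?!...) and the lazy quantifier .*? are Perl-style syntax; the POSIX basic and extended regular expressions that grep, sed, and awk use by default do not support them.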

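Simple examples

# One or more digits at the start, abc at the end
/^[0-9]+abc$/
# 3 to 15 characters drawn from lowercase letters, digits, - and _
/^[a-z0-9_-]{3,15}$/
# A positive integer of any length
/^[1-9][0-9]*$/
# A positive integer from 1 to 99
/^[1-9][0-9]?$/
/^[1-9][0-9]{0,1}$/
# Greedy match of <...>: on <h1>runoob.com</h1> it matches the whole string
/<.*>/
    <h1>runoob.com</h1>
# Lazy (minimal) match of <...>: on the same input it matches each tag separately
/<.*?>/
    <h1>
    </h1>
# "Chapter " followed by a two-digit number at the end of the string
/^Chapter [1-9][0-9]$/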
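
The anchored patterns above can be tried out with grep -E. The following is a minimal sketch, not part of the original article: the helper name matches and the sample strings are made up for illustration, and because extended regular expressions have no lazy quantifier, a negated character class stands in for <.*?>.

```shell
# Return success when STRING matches the extended regular expression PATTERN.
# (matches is a hypothetical helper name used only in this sketch.)
matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

matches '^[0-9]+abc$' '123abc' && echo '123abc: digits followed by abc'
matches '^[1-9][0-9]?$' '100' || echo '100: outside 1-99'
matches '^[a-z0-9_-]{3,15}$' 'user_01' && echo 'user_01: valid'

# Greedy: .* runs to the last > on the line, so the whole string is one match.
greedy=$(printf '%s\n' '<h1>runoob.com</h1>' | grep -Eo '<.*>')
# ERE has no lazy .*?; the negated class [^>]* stops at the first > instead.
tags=$(printf '%s\n' '<h1>runoob.com</h1>' | grep -Eo '<[^>]*>')
printf '%s\n' "$greedy" "$tags"
```

Without the trailing $ on the 1-99 pattern, 100 would be accepted, because the pattern would only need to match the leading 10.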

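read: reading standard input

Examples

# Plain input
read var
echo "$var"
# With a prompt
read -p "please enter a character:" var
echo "$var"
# -t sets how many seconds read waits for input; when the timer expires, read returns a non-zero exit status
read -t 5 -p "please enter a character in 5 seconds:" var
echo "$var"

Selected options

-a Takes a variable name, treats it as an array, and assigns the words of the input to it; words are split on the characters in IFS (whitespace by default).
-d Takes a delimiter; only the first character of the argument is used, and it ends the input in place of a newline.
-p Takes a prompt string, printed before input is read.
-e Uses readline to read the line, so line editing and completion are available while typing.
-n Takes a number, the maximum number of characters to read.
-r Disables backslash escaping. Without it, \ acts as an escape character; with it, \ is an ordinary character.
-s Silent mode: typed characters are not echoed to the screen, as when entering a password at login.
-t Takes a number of seconds, the time to wait for input.
-u Takes a file descriptor to read from; it can be a descriptor newly opened with exec.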
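
Most of the options above are interactive, so they are hard to show in a script. The sketch below sticks to non-interactive input from here-documents and shows two things the list only hints at: how IFS controls word splitting and what -r changes. The sample records are invented for illustration.

```shell
# Split one colon-separated record into variables.
# IFS=: applies only to this read; the last variable receives the rest of the line.
IFS=: read -r user pass uid rest <<'EOF'
root:x:0:0:root:/root:/bin/bash
EOF
printf 'user=%s uid=%s rest=%s\n' "$user" "$uid" "$rest"

# With -r the backslashes stay; without it they act as escapes and are removed.
read -r raw <<'EOF'
C:\temp\new
EOF
read cooked <<'EOF'
C:\temp\new
EOF
printf 'raw=%s cooked=%s\n' "$raw" "$cooked"

# Read line by line; IFS= keeps leading and trailing blanks intact.
total=0
while IFS= read -r line; do
  total=$((total + 1))
  if [ "$total" -eq 2 ]; then second=$line; fi
done <<'EOF'
one
  two
three
EOF
printf 'lines=%s second=[%s]\n' "$total" "$second"
```

In scripts, while IFS= read -r line is usually the safer default, since it returns each line exactly as written.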

TLDR

  read

  BASH builtin for retrieving data from standard input.
  More information: https://manned.org/read.1p.

  - Store data that you type from the keyboard:
    read variable

  - Store the words of the next line you enter as the elements of an array:
    read -a array

  - Specify the number of maximum characters to be read:
    read -n character_count variable

  - Use a specific character as a delimiter instead of a new line:
    read -d new_delimiter variable

  - Do not let backslash (\) act as an escape character:
    read -r variable

  - Display a prompt before the input:
    read -p "Enter your input here: " variable

  - Do not echo typed characters (silent mode):
    read -s variable

  - Read stdin and perform an action on every line:
    while read line; do echo "$line"; done

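find: searching for files

Common options

# -name: match by file name
# -iname: match by file name, ignoring case
# -size [+|-]size: match by file size
# -atime [+|-]days: match by last access time
# -mtime [+|-]days: match by last modification time of the file data
# -ctime [+|-]days: match by last change time of the file status
# -perm mode: files whose permissions are exactly mode
# -perm -mode: files whose permissions include all of the bits in mode
# -perm /mode: files whose permissions include any of the bits in mode (older GNU find wrote this as +mode, which current versions no longer accept)
# -type d: directories
# -type f: regular files
# -type l: symbolic links
# -maxdepth 2: descend at most 2 levels below the starting point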

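Common usage

# Find files named passwd under the current directory
find . -type f -name passwd
# Same, ignoring case
find . -type f -iname passwd
# Find files ending in .txt under the current directory (quote the pattern so the shell does not expand it first)
find . -type f -name '*.txt'
# Find files larger than 50M under the current directory
find . -size +50M
# Find files whose size is 28k
find . -size 28k
# Find files larger than 50M and smaller than 100M
find . -size +50M -size -100M
# Find files named test.sh under the current directory and copy them to /home
find . -type f -name "test.sh" | xargs -I{} cp {} /home
find . -type f -name "test.sh" -exec cp {} /home \;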
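
To try these commands without touching real files, the sketch below builds a throwaway directory tree first. The directory layout and file names are invented for illustration; only the find options come from the list above.

```shell
# Build a scratch tree so the commands are safe to run anywhere.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/deep" "$workdir/backup"
touch "$workdir/a.txt" "$workdir/src/b.txt" "$workdir/src/deep/c.TXT" "$workdir/src/test.sh"

# Quote the pattern so the shell hands it to find unexpanded.
txt_count=$(find "$workdir" -type f -name '*.txt' | wc -l)
# -iname ignores case, so c.TXT is counted as well.
itxt_count=$(find "$workdir" -type f -iname '*.txt' | wc -l)
# -maxdepth 1 looks only at the starting directory itself.
top_count=$(find "$workdir" -maxdepth 1 -type f -name '*.txt' | wc -l)
echo "name: $txt_count  iname: $itxt_count  maxdepth 1: $top_count"

# -exec runs cp once per match; {} is replaced by the path that was found.
find "$workdir/src" -type f -name 'test.sh' -exec cp {} "$workdir/backup" \;
ls "$workdir/backup"

# Remove the scratch tree when finished: rm -rf "$workdir"
```

An unquoted *.txt is expanded by the shell whenever the current directory contains matching files, which changes what find receives, so quoting the pattern is the safer habit.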

TLDR

  find

  Find files or directories under the given directory tree, recursively.
  More information: https://manned.org/find.

  - Find files by extension:
    find root_path -name '*.ext'

  - Find files matching multiple path/name patterns:
    find root_path -path '**/path/**/*.ext' -or -name '*pattern*'

  - Find directories matching a given name, in case-insensitive mode:
    find root_path -type d -iname '*lib*'

  - Find files matching a given pattern, excluding specific paths:
    find root_path -name '*.py' -not -path '*/site-packages/*'

  - Find files matching a given size range:
    find root_path -size +500k -size -10M

  - Run a command for each file (use `{}` within the command to access the filename):
    find root_path -name '*.ext' -exec wc -l {} \;

  - Find files modified in the last 7 days and delete them:
    find root_path -daystart -mtime -7 -delete

  - Find empty (0 byte) files and delete them:
    find root_path -type f -empty -delete

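grep: searching text

REGEXP

^    	# anchors the start of a line, e.g. '^grep' matches every line that starts with grep
$    	# anchors the end of a line, e.g. 'grep$' matches every line that ends with grep
.    	# matches any single character except a newline, e.g. 'gr.p' matches gr, then any one character, then p
*    	# matches zero or more of the preceding character, e.g. ' *grep' matches zero or more spaces followed by grep
.*   	# used together, matches any run of characters
[]   	# matches one character from the set, e.g. '[Gg]rep' matches Grep and grep
[^]  	# matches one character not in the set, e.g. '[^A-FH-Z]rep' matches a character other than A-F and H-Z, followed by rep
\(..\)  # captures the matched text, e.g. '\(love\)' stores love as \1
\<      # anchors the start of a word, e.g. '\<grep' matches lines containing a word that starts with grep
\>      # anchors the end of a word, e.g. 'grep\>' matches lines containing a word that ends with grep
x\{m\}  # x repeated exactly m times, e.g. 'o\{5\}' matches lines containing 5 consecutive o characters
x\{m,\} # x repeated at least m times, e.g. 'o\{5,\}' matches lines with at least 5 consecutive o characters
x\{m,n\}# x repeated at least m and at most n times, e.g. 'o\{5,10\}' matches 5 to 10 consecutive o characters
\w    	# matches one word character (letter, digit, or underscore), e.g. 'G\w*p' matches G, zero or more word characters, then p
\W    	# the inverse of \w: matches one non-word character, such as a period
\b    	# word boundary, e.g. '\bgrep\b' matches grep only as a whole word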

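Common usage

# Print the lines that contain root
grep root passwd
# Print the lines that do not contain root
grep -v root passwd
# Print the number of lines that contain root
grep -c root passwd
# Print matching lines with their line numbers
grep -n root passwd
# Print matching lines with their line numbers and byte offsets
grep -nb root passwd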

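Advanced usage

# Lines that start with root, with line numbers
grep -n "^root" passwd
# Lines that end with root, ignoring case
grep -i "root$" passwd
# Lines that do not contain root
grep -v "root" passwd
# Lines that contain root as a whole word
grep -w "root" passwd
# Empty lines
grep "^$" passwd
# Lines that start with the letter n or s
grep "^[ns]" passwd
# Lines that contain a digit (quote the pattern so the shell does not treat it as a glob)
grep "[0-9]" passwd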

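Selected options

# Print each matching line plus the given number of lines after it
-A<number of lines>
# Print each matching line plus the given number of lines before it
-B<number of lines>
# Print each matching line plus the given number of lines before and after it
-C<number of lines>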
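
The context options are easiest to understand on a small file. This is a minimal sketch with made-up sample lines written to a temporary file:

```shell
sample=$(mktemp)
printf '%s\n' 'line1' 'line2' 'root here' 'line4' 'line5' > "$sample"

after=$(grep -A1 'root' "$sample")    # the match plus 1 line after it
before=$(grep -B1 'root' "$sample")   # the match plus 1 line before it
around=$(grep -C1 'root' "$sample")   # the match plus 1 line on each side
count=$(grep -c 'line' "$sample")     # number of lines containing "line"

printf '%s\n' "$around"
echo "lines containing \"line\": $count"
```

Here -C1 prints line2, root here, and line4; -c counts matching lines, not individual matches.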

TLDR

  grep

  Find patterns in files using regular expressions.
  More information: https://www.gnu.org/software/grep/manual/grep.html.

  - Search for a pattern within a file:
    grep "search_pattern" path/to/file

  - Search for an exact string (disables regular expressions):
    grep --fixed-strings "exact_string" path/to/file

  - Search for a pattern in all files recursively in a directory, showing line numbers of matches, ignoring binary files:
    grep --recursive --line-number --binary-files=without-match "search_pattern" path/to/directory

  - Use extended regular expressions (supports `?`, `+`, `{}`, `()` and `|`), in case-insensitive mode:
    grep --extended-regexp --ignore-case "search_pattern" path/to/file

  - Print 3 lines of context around, before, or after each match:
    grep --context|before-context|after-context=3 "search_pattern" path/to/file

  - Print file name and line number for each match:
    grep --with-filename --line-number "search_pattern" path/to/file

  - Search for lines matching a pattern, printing only the matched text:
    grep --only-matching "search_pattern" path/to/file

  - Search stdin for lines that do not match a pattern:
    cat path/to/file | grep --invert-match "search_pattern"

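sort: sorting

Examples

# Plain sort
sort /etc/passwd
# Sort and drop duplicate lines
sort -u /etc/passwd
# Sort by numeric value instead of character order
sort -n /etc/passwd
# Reverse order
sort -r /etc/passwd
# Use the third field as the sort key (fields are separated by blanks by default)
sort -k3 /etc/passwd
# Set the field separator (it matters once a key is given with -k)
sort -t: /etc/passwd
# Sort starting from the second character of the user name
sort -t: -k1.2 /etc/passwd
# Sort from the second character of the user name; on ties, sort by numeric user ID
sort -t: -k1.2,1 -k3,3n /etc/passwd
# With a space as the separator, sort by the third field numerically in descending order; on ties, sort by the second field in ascending order
sort -t' ' -k3,3nr -k2,2 facebook.txt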
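
Multi-key sorting is the part of sort that most often goes wrong, because options given globally (such as -nr) apply to every key that has no options of its own. The sketch below attaches the ordering to each key instead. The data file and its contents are hypothetical, and the second column is compared numerically here, which is an assumption about the data.

```shell
data=$(mktemp)
printf '%s\n' 'bob 10 300' 'amy 20 300' 'cat 30 100' 'dan 5 200' > "$data"

# Key 1: third field only (3,3), numeric, reversed.
# Key 2: second field only (2,2), numeric, ascending; used when key 1 ties.
sorted=$(LC_ALL=C sort -t' ' -k3,3nr -k2,2n "$data")
printf '%s\n' "$sorted"
```

bob and amy tie on 300, so the second key decides and bob (10) comes before amy (20). Writing -k3,3 instead of -k3 limits the key to that single field; -k3 alone extends to the end of the line.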

TLDR

  sort

  Sort lines of text files.
  More information: https://www.gnu.org/software/coreutils/sort.

  - Sort a file in ascending order:
    sort path/to/file

  - Sort a file in descending order:
    sort --reverse path/to/file

  - Sort a file in case-insensitive way:
    sort --ignore-case path/to/file

  - Sort a file using numeric rather than alphabetic order:
    sort --numeric-sort path/to/file

  - Sort `/etc/passwd` by the 3rd field of each line numerically, using ":" as a field separator:
    sort --field-separator=: --key=3n /etc/passwd

  - Sort a file preserving only unique lines:
    sort --unique path/to/file

  - Sort a file, printing the output to the specified output file (can be used to sort a file in-place):
    sort --output=path/to/file path/to/file

  - Sort numbers with exponents:
    sort --general-numeric-sort path/to/file

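cut: extracting parts of lines

Usage notes

N-		# from the Nth item to the end of the line
N-M		# from the Nth item to the Mth item
-M		# from the start of the line to the Mth item
-b		# select bytes
-c		# select characters
-f		# select fields
-d       # set the field delimiter; the default is a tab

General usage

# Print the third field
cut -f3 passwd
# Set the field delimiter
cut -f3 -d: passwd
# Print the first and third fields
cut -f1,3 -d: passwd
# Print the third through fifth fields
cut -f3-5 -d: passwd
# Print everything except the fourth, fifth, and sixth fields (--complement is a GNU extension)
cut -f4-6 -d: --complement passwd
# Print the first three fields
cut -f-3 -d: passwd

# Print from the third field to the last field
cut -f3- -d: passwd
# Print the third through fifth characters of each line
cut -c3-5 passwd
# Print the third and fifth characters of each line
cut -c3,5 passwd
# Print the first five characters of each line
cut -c-5 passwd
# Print from the third character to the end of each line
cut -c3- passwd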
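
The field and character ranges above can be checked against a single known record. This sketch uses a typical passwd-style line as sample input; --complement assumes GNU cut.

```shell
record='root:x:0:0:root:/root:/bin/bash'

first_third=$(printf '%s\n' "$record" | cut -d: -f1,3)          # fields 1 and 3
three_to_five=$(printf '%s\n' "$record" | cut -d: -f3-5)        # fields 3 through 5
first_chars=$(printf '%s\n' "$record" | cut -c1-4)              # characters 1 through 4
# GNU extension: everything except fields 4 through 6
without_4_6=$(printf '%s\n' "$record" | cut -d: -f4-6 --complement)

printf '%s\n' "$first_third" "$three_to_five" "$first_chars" "$without_4_6"
```

Selected fields are joined with the input delimiter, so -f1,3 yields root:0.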

TLDR

  cut

  Cut out fields from stdin or files.
  More information: https://www.gnu.org/software/coreutils/cut.

  - Cut out the first sixteen characters of each line of stdin:
    cut -c 1-16

  - Cut out the first sixteen characters of each line of the given files:
    cut -c 1-16 file

  - Cut out everything from the 3rd character to the end of each line:
    cut -c 3-

  - Cut out the fifth field of each line, using a colon as a field delimiter (default delimiter is tab):
    cut -d':' -f5

  - Cut out the 2nd and 10th fields of each line, using a semicolon as a delimiter:
    cut -d';' -f2,10

  - Cut out the fields 3 through to the end of each line, using a space as a delimiter:
    cut -d' ' -f3-

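sed: substituting and editing text

sed metacharacters

^ # matches the start of a line, e.g. /^sed/ matches every line that starts with sed
$ # matches the end of a line, e.g. /sed$/ matches every line that ends with sed
. # matches any single character except a newline, e.g. /s.d/ matches s, then any one character, then d
* # matches zero or more of the preceding character, equivalent to \{0,\}, e.g. / *sed/ matches zero or more spaces followed by sed
[] # matches one character from the set, e.g. /[sS]ed/ matches sed and Sed
[^] # matches one character not in the set, e.g. /[^A-RT-Z]ed/ matches a character other than A-R and T-Z, followed by ed
\(..\) # matches a substring and saves it, e.g. s/\(love\)able/\1rs/ turns loveable into lovers
& # stands for the matched text in the replacement, e.g. s/love/**&**/ turns love into **love**
\< # matches the start of a word, e.g. /\<love/ matches lines containing a word that starts with love
\> # matches the end of a word, e.g. /love\>/ matches lines containing a word that ends with love
x\{m\} # x repeated exactly m times, e.g. /0\{5\}/ matches lines containing 5 consecutive zeros
x\{m,\} # x repeated at least m times, e.g. /0\{5,\}/ matches lines with at least 5 consecutive zeros
x\{m,n\} # x repeated at least m and at most n times, e.g. /0\{5,10\}/ matches 5 to 10 consecutive zeros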

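Printing: p (print)

# Print every line twice
sed 'p' passwd
# Print only the first line
sed -n '1p' passwd
# Print lines 1 through 3
sed -n '1,3p' passwd
# Print the lines containing Jason
sed -n '/Jason/p' passwd
# Print from the line containing Jason through the line containing Jane
sed -n '/Jason/,/Jane/p' passwd
# Print the lines that start with 103
sed -n '/^103/p' passwd
# Print the lines that end with manager
sed -n '/manager$/p' passwd
# Print the lines that start with three consecutive digits
sed -n '/^[0-9]\{3\}/p' passwd
# Print from line 1 with a step of 2 (first~step addresses are a GNU sed extension)
sed -n '1~2p' passwd
# Print from line 2 with a step of 3
sed -n '2~3p' passwd
# Print the lines whose uid is 0 or 1
sed -n '/x:[01]:/p' passwd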

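Substitution: s (substitute)

sed '[address-range|pattern-range] s/originalstring/replacement-string/[substitute-flags]' inputfile

# Replace the first : on each line with -
sed 's/:/-/' /etc/passwd
# Replace every : with -
sed 's/:/-/g' /etc/passwd
# Replace every : with - and write the result back to the file (run this on a working copy, not on /etc/passwd itself)
sed -i 's/:/-/g' passwd
# Same, but keep a backup of the original as passwd.bak
sed -i.bak 's/:/-/g' passwd
# Delete every line that contains root
sed '/root/d' /etc/passwd
# Replace every root with ---------
sed '/root/ s/root/---------/g' passwd
# Several expressions: replace both : and x with -
sed -e 's/:/-/g' -e 's/x/-/g' passwd
# Use # as the delimiter of the s command instead of /
sed 's#:#-#g' passwd
# Replace the third match and every match after it (combining a number with g is a GNU sed extension)
sed 's/:/-/3g' passwd
# Replace only the third match
sed 's/:/-/3' passwd
# On lines 2 through 5, replace manager with guozi
sed '2,5 s/manager/guozi/' passwd
# On lines containing john, replace manager with guozi
sed '/john/ s/manager/guozi/' passwd
# From line 4 through the next line containing ram, replace developer with user
sed '4,/ram/ s/developer/user/' passwd
# On lines containing john, replace john with [john]
sed 's/john/[&]/' passwd
# Wrap a leading run of three or more lowercase letters in {}
sed 's/^[a-z]\{3,\}/{&}/' passwd
# Delete the first character of every line
sed 's/^.//' passwd
# Delete the third character of every line
sed 's/.//3' passwd
# On lines containing #, delete everything from the first comma onward
sed '/#/ s/,.*//' passwd
# Append # to the end of every line
sed 's/$/#/' passwd
# In the ifconfig output, find the IPv4 address of each interface and mark it with {}
ifconfig | sed 's/\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}/{&}/'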
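
The substitution flags are easier to compare when they all run against the same input. The following sketch uses one invented passwd-style line; the 3g form assumes GNU sed.

```shell
line='john:x:1001:1001:manager:/home/john:/bin/bash'

first=$(printf '%s\n' "$line" | sed 's/:/-/')        # first match only
third=$(printf '%s\n' "$line" | sed 's/:/-/3')       # third match only
from_third=$(printf '%s\n' "$line" | sed 's/:/-/3g') # third match onward (GNU sed)
marked=$(printf '%s\n' "$line" | sed 's/john/[&]/')  # & is the matched text
# \1 is the text captured by the first \( \) group.
lovers=$(printf '%s\n' 'loveable' | sed 's/\(love\)able/\1rs/')
# Any character can delimit s; # avoids escaping the slashes in paths.
moved=$(printf '%s\n' '/usr/bin' | sed 's#/usr#/opt#')

printf '%s\n' "$first" "$third" "$from_third" "$marked" "$lovers" "$moved"
```

Without the g flag only the first john on the line is wrapped, so the one inside /home/john stays unchanged.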

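Deletion: d (delete)

# Delete empty lines
sed '/^$/d' passwd
# Delete the second line
sed '2d' passwd
# Delete from the second line to the end of the file
sed '2,$d' passwd
# Delete every line except lines 2 through 4
sed '2,4!d' passwd
# Delete the last line
sed '$d' passwd
# Delete every line that starts with test
sed '/^test/d' passwd
# Delete all comment lines and empty lines
sed -e '/^$/d' -e '/^#.*/d' passwd
sed '/^$/d;/^#.*/d' passwd
# Number the lines, then delete lines 2 through 5
nl passwd | sed '2,5d'
# Delete from the line containing adm to the last line
sed '/adm/,$d' passwd
# Delete the line containing adm and the line after it (addr,+N is a GNU sed extension)
sed '/adm/,+1d' passwd
# Within lines 1 through 5, delete the lines that match root
sed '1,5{/root/d}' passwd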

Appending: a (append)

# sed '[address] a the-line-to-append' input-file
# Add a line ------------------ after the second line
sed '2 a ----------------' passwd
# Add a line ------------------ after each of lines 2 through 5
sed '2,5 a ------------------' passwd
# Add a line ----------------- after the last line
sed '$ a ----------------' passwd
# Add a line -------------- after each line containing root
sed '/root/ a -------------' passwd

Inserting: i (insert)

# sed '[address] i the-line-to-insert' input-file
# Add a line ------------------ before the second line
sed '2 i ----------------' passwd
# Add a line ------------------ before each of lines 2 through 5
sed '2,5 i ------------------' passwd
# Add a line ----------------- before the last line
sed '$ i ----------------' passwd
# Add a line -------------- before each line containing root
sed '/root/ i -------------' passwd

Changing: c (change)

# sed '[address] c the-replacement-line' input-file
# Replace the second line with ------------------
sed '2 c ----------------' passwd
# Replace lines 2 through 5 with ------------------ (GNU sed prints the replacement once for the whole range)
sed '2,5 c ------------------' passwd
# Replace the last line with -----------------
sed '$ c ----------------' passwd
# Replace each line containing root with --------------
sed '/root/ c -------------' passwd
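
The difference between a, i, and c shows up clearly on a three-line input. This is a small sketch with made-up input; the one-line a text form used throughout this section is GNU sed syntax, so the sketch assumes GNU sed.

```shell
input=$(printf '%s\n' one two three)

appended=$(printf '%s\n' "$input" | sed '2 a ----')   # new line after line 2
inserted=$(printf '%s\n' "$input" | sed '2 i ----')   # new line before line 2
changed=$(printf '%s\n' "$input" | sed '2 c ----')    # line 2 replaced
kept=$(printf '%s\n' "$input" | sed '2,3!d')          # everything but lines 2-3 deleted

# Join each result onto one line for easy comparison.
for result in "$appended" "$inserted" "$changed" "$kept"; do
  printf '%s\n' "$result" | paste -sd' ' -
done
```

The four results are: one two ---- three, one ---- two three, one ---- three, and two three.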

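Match flags

# A positive integer selects which match on the line to replace; g is the global flag
# Replace only the second dog on each line with cat and leave the other matches alone
echo "dog dog dog dog dog" | sed 's/dog/cat/2'
# Replace every dog on each line with cat
echo "dog dog dog dog dog" | sed 's/dog/cat/g'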

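Other commands

# Transliteration: y
# Convert a, b, c in the text to A, B, C respectively
echo "abcadbdcddd" | sed 'y/abc/ABC/'

# Script file: -f file keeps the commands in a file, which suits command sets that are run repeatedly
# Write the commands to a file
vim file.txt
s/brown/white/g
s/dog/cat/g
s/lazy/sleeping/g
# Run the commands in the file with -f
sed -f file.txt passwd

# In-place editing: -i
# Writes the modified content straight back to the source file
# With a suffix, the original is first saved as a backup (here passwd.bak) and the new content replaces the source file
sed -i.bak 's/root/change/' passwd

# Save the modified content separately in a file
# Only the lines that were changed are saved
sed 's/root/change/w change.txt' passwd
# A global substitution can be combined with w, but g must come before w
sed 's/root/change/gw change.txt' passwd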

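Tips

# Count the lines in a file; = prints the line number, on a line of its own
sed -n '$=' passwd
awk 'END{print NR}' passwd
# Print the file with line numbers: sed puts each number on its own line before the text, awk puts it on the same line
sed '=' passwd
awk '{print NR"\t"$0}' passwd
# Print the line numbers of the lines containing the
sed -n '/the/=' passwd
# Print the lines that start with mysql
sed -n '/^mysql/p' passwd
# Print the lines that end with a digit
sed -n '/[0-9]$/p' passwd
# Print the lines containing the word wood; \< and \> mark word boundaries
sed -n '/\<wood\>/p' passwd
# Print the lines containing a word that starts with wood
sed -n '/\<wood/p' passwd
# Print the lines containing a word that ends with wood
sed -n '/wood\>/p' passwd
# Add # to the start of every line
sed 's/^/#/' passwd
# Add # to the start of every line containing the
sed '/the/ s/^/#/' passwd
# Add the text EOF to the end of every line
sed 's/$/EOF/' passwd
# On lines 3 through 5, replace every the with THE
sed '3,5 s/the/THE/g' passwd
# On lines containing the, replace every o with O
sed '/the/ s/o/O/g' passwd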

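Moving text

# Move the text that matches a condition
# H appends the current line to the hold space (sed's "clipboard")
# g overwrites the current line with the hold space
# G appends the hold space to the current line
# w saves to a file
# r reads the given file
# a appends the given text

# H copies to the hold space, d deletes the line, $G appends the hold space at the end of the file
cat passwd | head -5 | sed '/daemon/H;/daemon/d;$G'
cat passwd | head -5 | sed '/daemon/{H;d};$G'
# Move lines 1 through 5 to after line 17
cat -n passwd | sed '1,5H; 1,5d; 17G'
# Save the lines containing the to a new file
sed '/the/w text.txt' passwd
# Append hostname to the end of every line containing the
sed '/the/ s/$/hostname/' passwd
# Insert a blank line after line 3 (the hold space is empty, so G appends only a newline)
sed '3G' passwd
# Insert a new line with the content NEW after line 3
cat -n passwd | sed '3 a NEW'
sed '3 a NEW' passwd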
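
The hold-space commands are easier to follow on a five-line input. The sketch below is an illustration, not the article's own method: it moves lines 2 and 3 to the end, then shows the blank line that H leaves behind when the hold space starts out empty. The input letters are invented, and the closing ; before } keeps the script acceptable to non-GNU sed as well.

```shell
input=$(printf '%s\n' a b c d e)

# h copies line 2 into the hold space, H appends line 3 to it,
# d drops both from the output, and $G appends the hold space after the last line.
moved=$(printf '%s\n' "$input" | sed '2{h;d;};3{H;d;};$G')
printf '%s\n' "$moved" | paste -sd' ' -

# H always adds a newline first, so with an empty hold space the saved text
# starts with an empty line, which shows up before the moved line.
with_blank=$(printf '%s\n' "$input" | sed '2{H;d;};$G')
printf '%s\n' "$with_blank"
```

The first command prints a d e b c. The second prints six lines, with an empty one between e and b. Both forms depend on the last line not being deleted by d, because $G runs only if the last line reaches it.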

TLDR

  sed

  Edit text in a scriptable manner.
  More information: https://www.gnu.org/software/sed/manual/sed.html.

  - Replace the first occurrence of a regular expression in each line of a file, and print the result:
    sed 's/regular_expression/replace/' filename

  - Replace all occurrences of an extended regular expression in a file, and print the result:
    sed -r 's/regular_expression/replace/g' filename

  - Replace all occurrences of a string in a file, overwriting the file (i.e. in-place):
    sed -i 's/find/replace/g' filename

  - Replace only on lines matching the line pattern:
    sed '/line_pattern/s/find/replace/' filename

  - Delete lines matching the line pattern:
    sed '/line_pattern/d' filename

  - Print the first 11 lines of a file:
    sed 11q filename

  - Apply multiple find-replace expressions to a file:
    sed -e 's/find/replace/' -e 's/find/replace/' filename

  - Replace separator `/` by any other character not used in the find or replace patterns, e.g. `#`:
    sed 's#find#replace#' filename

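awk: processing text and data

# Print the first field of every line
cut -d : -f 1 /etc/passwd
# Print the first field of the lines that end in wd
awk -F: '/wd$/{print $1}' /etc/passwd
# Find the PIDs of all processes using port 22
netstat -antup | grep :22 | awk '{print $7}' | sed 's|/.\{0,\}||'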

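Built-in variables

Description

NR			number of records: the number of the current line (in END, the total number of lines read)
NF			number of fields in the current line
$0			the whole line
$1			the first field
$2			the second field
$NF			the last field
$(NF-1)		the second-to-last field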

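Examples

# Print the line number of every line
ps | awk '{print NR}'
# Print the number of fields in every line
ps | awk '{print NF}'
# Print the whole line
ps | awk '{print $0}'
ps | awk '{print}'
# Print the first field of every line
ps | awk '{print $1}'
# Prefix every line with its line number and its field count
ps | awk '{print NR "\t" NF "\t" $0}'
# Print the last field of every line
ps | awk '{print $NF}'
# Print the second-to-last field of every line
ps | awk '{print $(NF-1)}'
# Print the third and fourth fields of every line
ps | awk '{print $3,$4}'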

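External variables

Examples

# Pass a shell variable in with -v; it is available everywhere, including BEGIN
VAR=1000
echo | awk -v VARIABLE="$VAR" '{print VARIABLE}'

# Assignments placed after the program are applied when awk reaches them in the argument list
var1="aaa"
var2="bbb"
echo | awk '{print v1,v2}' v1="$var1" v2="$var2"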
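
The two ways of passing shell values into awk differ in when the value becomes visible. A minimal sketch, with invented variable names and sample records:

```shell
threshold=1000

# -v assigns before the program starts, so the value can be used in any rule.
names=$(printf '%s\n' 'root:x:0:0' 'alice:x:1000:1000' 'bob:x:1001:1001' |
  awk -F: -v min="$threshold" '$3 >= min {print $1}')
printf '%s\n' "$names"

# An assignment written after the program is processed like a file argument,
# which happens after BEGIN has already run.
early=$(awk -v v=set 'BEGIN{print "[" v "]"}')
late=$(awk 'BEGIN{print "[" v "]"}' v=set)
echo "with -v: $early  as trailing argument: $late"
```

The filter prints alice and bob. In the BEGIN comparison, -v yields [set] while the trailing assignment yields [], so -v is the form to use whenever BEGIN needs the value.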

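Operators and tests

Example 1

# Operands that are not numbers are converted to 0 when used with arithmetic operators
# Arithmetic operators: + - * / % (modulo) ! (logical not) ^ (exponentiation) ++ --
# Assignment operators: = += -= *= /= %= ^=
awk 'BEGIN{a="b"; print a++, ++a}'
# || logical or, && logical and
awk 'BEGIN{a=1;b=2;print (a>5 && b<2), (a>5 || b<=2);}'
# Regular expression operators: ~ matches, !~ does not match
awk 'BEGIN{a="100testa"; if(a ~ /^100*/){print "ok";}}'
# Relational operators: < <= > >= != ==
awk 'BEGIN{a=11;if(a >= 9){print "ok";}}'
awk 'BEGIN{a="b";print a=="b"?"ok":"err";}'
# in tests whether an index exists in the array; it does not look at the stored values
awk 'BEGIN{a="a";arr[0]="b";arr[1]="c";print (a in arr);}'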
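
The results of these expressions are not obvious at first sight, so the sketch below captures them in shell variables. The expressions follow the examples above, restructured slightly (intermediate variables x and y) to keep the print statement unambiguous.

```shell
# "b" is not numeric, so it counts as 0: a++ yields 0, then ++a yields 2.
incr=$(awk 'BEGIN{a="b"; print a++, ++a}')

# Comparisons and logical operators yield 1 for true and 0 for false.
logic=$(awk 'BEGIN{a=1; b=2; x = (a>5 && b<2); y = (a>5 || b<=2); print x, y}')

# in checks array indexes: "b" is a stored value, 0 is an index.
member=$(awk 'BEGIN{arr[0]="b"; arr[1]="c"; print ("b" in arr), (0 in arr)}')

# ~ tests a regular expression: /^100*/ is 10 followed by zero or more 0s.
regex=$(awk 'BEGIN{a="100testa"; print (a ~ /^100*/) ? "ok" : "no"}')

printf '%s\n' "$incr" "$logic" "$member" "$regex"
```

The four lines printed are 0 2, 0 1, 0 1, and ok.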

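Example 2

# NR number of records
# NF number of fields
# FNR record number within the current file
# FS field separator
# RS record separator
# ORS output record separator
# FIELDWIDTHS field widths (gawk extension)

# Print the whole line
awk '{print}' passwd
# $0 also prints the whole line
awk '{print $0}' passwd
# Split on colons and print the first field
awk -F':' '{print $1}' passwd
# Split on x and print the first field
awk -F'x' '{print $1}' passwd
# Use several separator characters (quote the bracket expression so the shell leaves it alone)
awk -F'[:/]' '{print $1,$3}' passwd
# Set field widths; the separator setting is ignored and fields are split by character count
awk -F':' 'BEGIN{FIELDWIDTHS="5 10 10 5"} {print $1,$2,$5}' passwd
# Print the first and third fields joined together with nothing between them
awk -F':' '{print $1$3}' passwd
# Put a space in double quotes to print it literally between the fields
awk -F':' '{print $1" "$3}' passwd
# A comma between the fields also prints a space (the output field separator)
awk -F':' '{print $1,$3}' passwd
# Put a tab between the two fields
awk -F':' '{print $1"\t"$3}' passwd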


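# Print the number of fields in every line
awk -F':' '{print NF}' passwd
# Print the line numbers
awk -F':' '{print NR}' passwd
# Print the line number followed by the line itself
awk -F':' '{print NR,$0}' passwd
# Print the second line; print can be omitted because printing is the default action
awk -F':' 'NR==2' passwd
awk -F':' 'NR==2 {print}' passwd
awk -F':' 'NR==2 {print $0}' passwd
# Print the first field of the second line
awk -F':' 'NR==2 {print $1}' passwd
# Print the last field
awk -F':' '{print $NF}' passwd
# Print the total number of lines
awk -F':' 'END{print NR}' passwd
# Print the last line of the file ($0 still holds the last record inside END in gawk and other POSIX-conforming awks)
awk 'END{print $0}' passwd
# Add descriptive text around the line number
awk -F':' '{print "this is the "NR" line of the file."}' passwd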

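# Matching
# /root/ matches any line that contains root; for an exact match, compare a field with a double-quoted string, e.g. $1=="root"
awk -F':' '/root/ {print $0}' passwd
# Split on colons and print the lines whose first field contains root
awk -F':' '$1~/root/ {print}' passwd
# Split on colons and print the lines whose fifth field contains o
awk -F':' '$5~/o/ {print}' passwd
# Split on colons; for lines whose seventh field does not end in nologin, print the first and seventh fields
awk -F':' '$7!~/nologin$/ {print $1,$7}' passwd
# Search for lines containing root and print the whole line
awk -F':' '/root/ {print $0}' passwd
# Search for lines containing root and print the first field
awk -F':' '/root/ {print $1}' passwd
# Search for lines containing root and print the first and sixth fields
awk -F':' '/root/ {print $1,$6}' passwd
# Search for lines containing root
awk -F':' '/root/ {print}' passwd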

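# BEGIN and END
# Define variables and do arithmetic on them
awk 'NR==2 {x=10;y=2;x=x+y;print x}' passwd
# x=10, print x
awk 'NR==2 {x=10;print x}' passwd
# x=2, y=3, raise x to the power y, print x
awk 'NR==2 {x=2;y=3;x=x^y;print x}' passwd
# Set FS to a colon in BEGIN, before any line is read
awk 'BEGIN{FS=":"} {print $1,$3}' passwd
# -F has the same effect
awk -F':' '{print $1,$3}' passwd
# OFS is the output field separator; it can be set by hand in BEGIN
awk 'BEGIN{OFS="----"} {print $1,$3,$5}' passwd
# Use a colon as the record separator, so each colon-separated piece becomes one record
awk 'BEGIN{RS=":"} {print $0}' passwd
# Join all lines into one line of output, with a space between the original lines
awk 'BEGIN{ORS=" "} {print $0}' passwd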

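# Comparing numbers and strings
# Print the first three lines with their line numbers
awk -F':' 'NR<=3 {print NR"\t"$0}' passwd
# Print the lines whose third field is 0
awk -F':' '$3==0 {print}' passwd
# Print the lines whose first field is exactly root
awk -F':' '$1=="root" {print}' passwd
# Print the lines whose third field is greater than or equal to 10
awk -F':' '$3>=10 {print}' passwd
# Logical and, logical or
# Print the lines whose third field is less than 5 or greater than 1005
awk -F':' '$3<5 || $3>1005 {print}' passwd
# Print the lines whose third field is greater than 10 and less than 15
awk -F':' '$3>10 && $3<15 {print}' passwd
# Print every integer from 1 to 200 that is divisible by 7 and contains the digit 7
seq 200 | awk '$1%7==0 && $1~/7/ {print $1}'
awk 'BEGIN{for(num=1;num<=200;num++){if(num%7==0 && num~/7/){print num}}}'
# Print every line whose second column is neither "1/1" nor "2/2"
kubectl get pods -A | awk '{if($2!="1/1" && $2!="2/2") {print $0}}'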


# Common usage
# Show this machine's IP address and extract it
ifconfig | grep -A8 eth0 | grep -w inet | awk '{print $2}' | sed 's/ //g'
# Show how many bytes this machine has received
ifconfig | grep -A8 eth0 | grep "RX" | grep bytes | awk '{print $5}' | sed 's/ //g'
# Show the total of bytes sent and received; $( ) captures a command's output and $(( )) does the arithmetic
rx=$(ifconfig | grep -A8 eth0 | grep "RX" | grep bytes | awk '{print $5}' | sed 's/ //g')
tx=$(ifconfig | grep -A8 eth0 | grep "TX" | grep bytes | awk '{print $5}' | sed 's/ //g')
bytesSum=$((rx + tx))
# Show the available space on the root partition
df -h | grep -w / | awk '{print $4}'
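
The traffic total above relies on capturing command output and adding the numbers. The sketch below shows the same idea on fixed sample text, so it runs without ifconfig or an eth0 interface. The sample text imitates ifconfig output and is invented for illustration.

```shell
# Two lines shaped like the RX/TX byte counters in ifconfig output.
sample='RX packets 100  bytes 2048 (2.0 KiB)
TX packets 80  bytes 1024 (1.0 KiB)'

# $( ) captures the command output; the byte count is the fifth field.
rx=$(printf '%s\n' "$sample" | awk '/RX/ && /bytes/ {print $5}')
tx=$(printf '%s\n' "$sample" | awk '/TX/ && /bytes/ {print $5}')
# $(( )) performs integer arithmetic on the captured values.
bytes_sum=$((rx + tx))

# The same total in one awk pass: accumulate in s, print it in END.
awk_sum=$(printf '%s\n' "$sample" | awk '/bytes/ {s += $5} END {print s}')
echo "rx=$rx tx=$tx sum=$bytes_sum awk_sum=$awk_sum"
```

awk can match the pattern and pick the field in one step, which makes the separate grep and sed stages of the original pipeline unnecessary.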

Advanced input and output

TLDR

  awk

  A versatile programming language for working on files.
  More information: https://github.com/onetrueawk/awk.

  - Print the fifth column (a.k.a. field) in a space-separated file:
    awk '{print $5}' filename

  - Print the second column of the lines containing "foo" in a space-separated file:
    awk '/foo/ {print $2}' filename

  - Print the last column of each line in a file, using a comma (instead of space) as a field separator:
    awk -F ',' '{print $NF}' filename

  - Sum the values in the first column of a file and print the total:
    awk '{s+=$1} END {print s}' filename

  - Print every third line starting from the first line:
    awk 'NR%3==1' filename

  - Print different values based on conditions:
    awk '{if ($1 == "foo") print "Exact match foo"; else if ($1 ~ "bar") print "Partial match bar"; else print "Baz"}' filename

  - Print all lines where the 10th column value equals the specified value:
    awk '($10 == value)'

  - Print all the lines which the 10th column value is between a min and a max:
    awk '($10 >= min_value && $10 <= max_value)'