Basic Regular Expressions and Using grep

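This article walks through basic regular expressions and their use in text searching with grep: finding a specific string, ignoring case, showing the lines before and after a match, counting matching lines, matching character sets, negated sets, anchoring to the start and end of a line, handling blank lines, matching any character and repeated characters, and limiting the number of consecutive repetitions. Each point is demonstrated with a worked example.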

Contents of regular_express.txt (^M marks a carriage return, i.e. a DOS-style line ending)

"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
GNU is free air not free beer.^M
Her hair is very beauty.^M
I can't finish the test.^M
Oh! The soup taste good.^M
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
Oh!     My god!
The gd software is a library for drafting programs.^M
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird


1. Finding a specific string

--color=auto highlights the matched keyword in color; -n prints the line number of each match.

grep -n --color=auto 'the' regular_express.txt

8:I can't finish <span style="color:#FF0000;">the</span> test.
12:<span style="color:#FF0000;">the</span> symbol '*' is represented as start.
15:You are <span style="color:#FF0000;">the</span> best is mean you are <span style="color:#FF0000;">the</span> no. 1.
16:The world <Happy> is <span style="color:#FF0000;">the</span> same with "glad".
18:google is <span style="color:#FF0000;">the</span> best tools for search keyword.

2. Showing the lines that do NOT contain a string

grep -vn 'the' regular_express.txt

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
13:Oh!	My god!
14:The gd software is a library for drafting programs.
17:I like dog.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:

3. Ignoring case

grep -in 'the' regular_express.txt

8:I can't finish <span style="color:#FF0000;">the</span> test.
9:Oh! <span style="color:#FF0000;">The</span> soup taste good.
12:<span style="color:#FF0000;">the</span> symbol '*' is represented as start.
14:<span style="color:#FF0000;">The</span> gd software is a library for drafting programs.
15:You are <span style="color:#FF0000;">the</span> best is mean you are <span style="color:#FF0000;">the</span> no. 1.
16:<span style="color:#FF0000;">The</span> world <Happy> is <span style="color:#FF0000;">the</span> same with "glad".
18:google is <span style="color:#FF0000;">the</span> best tools for search keyword.

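4. Showing n lines before or after the keyword

-A2 also lists the 2 lines after each match

-B2 also lists the 2 lines before each match

In the output, matching lines carry a ':' after the line number, context lines carry a '-', and '--' separates groups that are not adjacent.

grep -n -A2 -B2 'do' regular_express.txt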

2-apple is my favorite food.
3-Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.
6-GNU is free air not free beer.
7-Her hair is very beauty.
--
9-Oh! The soup taste good.
10-motorcycle is cheap than car.
11:This window is clear.
12-the symbol '*' is represented as start.
13-Oh!	My god!
--
15-You are the best is mean you are the no. 1.
16-The world <Happy> is the same with "glad".
17:I like dog.
18-google is the best tools for search keyword.
19-goooooogle yes!
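As a quick way to check how context lines are printed, the sketch below runs the same options on a small throwaway file. The five-word sample is made up for illustration and is not part of regular_express.txt. It also assumes a grep that supports -C (GNU and BSD grep do); -C1 is shorthand for -A1 -B1.

```shell
# Hypothetical five-line sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' one two dog four five > "$sample"

# One line of context on each side of the match.
with_a_b=$(grep -n -A1 -B1 'dog' "$sample")

# -C1 asks for the same context with a single option.
with_c=$(grep -n -C1 'dog' "$sample")

# Prints: 2-two / 3:dog / 4-four (one per line)
printf '%s\n' "$with_a_b"
rm -f "$sample"
```

Both variables hold the same three lines, so the choice between -A/-B and -C only matters when the two sides need different amounts of context.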

5. Counting the matching lines

grep -c 'the' regular_express.txt

5
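
Note that -c counts matching lines, not individual matches: line 15 contains "the" twice but adds only one to the total. A minimal sketch of the difference, using a made-up three-line sample and assuming a grep with the -o option (GNU and BSD grep have it):

```shell
# Hypothetical three-line sample; the second line contains "the" twice.
sample=$(mktemp)
printf '%s\n' 'I can not finish the test.' \
              'You are the best, you are the no. 1.' \
              'I like dog.' > "$sample"

# -c counts lines that match at least once: 2
line_count=$(grep -c 'the' "$sample")

# -o prints every match on its own line, so wc -l counts occurrences: 3
# (tr strips the padding some wc implementations add)
match_count=$(grep -o 'the' "$sample" | wc -l | tr -d ' ')

echo "lines: $line_count, matches: $match_count"
rm -f "$sample"
```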
6. Using brackets [] to match a character set

No matter how many characters appear inside [], the brackets stand for exactly one character.

grep -n 't[ae]st' regular_express.txt

8:I can't finish the <span style="color:#FF0000;">test</span>.
9:Oh! The soup <span style="color:#FF0000;">tast</span>e good.
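
To see that the brackets consume exactly one character, the candidate words below (made up for this sketch and fed through a pipe instead of the sample file) differ only in what sits between t and st:

```shell
# Only words with exactly one "a" or "e" between "t" and "st" pass the filter.
# "tst" has nothing in between, "toast" has the wrong letters, "taest" has two.
matches=$(printf '%s\n' test taste tst toast taest | grep 't[ae]st')

# Prints "test" and "taste", one per line.
printf '%s\n' "$matches"
```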

7. Negated character sets

[^...] matches any one character that is NOT in the set.

grep -n '[^g]oo' regular_express.txt

2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
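
Lines 18 and 19 both begin with "goo", so it may be surprising that they are listed. The sketch below is one way to see which part of each line actually matched. It assumes a grep with -o (print only the matched text, available in GNU and BSD grep), and the short input strings are cut down from the sample lines for illustration.

```shell
# "google" on its own has no "oo" preceded by a non-g character: count is 0.
# (|| true keeps the script going, since grep exits non-zero on no match.)
google_only=$(printf '%s\n' 'google' | grep -c '[^g]oo' || true)

# Line 18 matches because of "too" inside "tools".
from_tools=$(printf '%s\n' 'google is the best tools' | grep -o '[^g]oo')

# Line 19 matches because an "o" can itself be the non-g character before "oo".
from_many_o=$(printf '%s\n' 'goooooogle yes!' | grep -o '[^g]oo')

echo "google alone: $google_only"
echo "line 18: $from_tools"
printf 'line 19: %s\n' $from_many_o
```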
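8. [a-z] [A-Z] [0-9]

When the characters in a set are consecutive, such as uppercase letters, lowercase letters, or digits, they can be written as ranges: [A-Z], [a-z], [0-9]. To match any letter or digit, write [a-zA-Z0-9].

grep -n '[^a-z]oo' regular_express.txt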

3:Football game is not use feet only.


9. Line-start and line-end anchors: ^ $

Match the search string only when it appears at the start of a line.

grep -n '^the' regular_express.txt

12:the symbol '*' is represented as start.

grep -n '^[a-z]' regular_express.txt

2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
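The ^ symbol means different things inside and outside a character set (the brackets [])! As the first character inside [] it means "negated selection"; outside [] it anchors the match to the start of the line.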
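
The two meanings can be combined in one pattern. The following sketch uses three lines borrowed from the sample file, written to a temporary file. LC_ALL=C is set as an assumption so that [a-z] means exactly the ASCII lowercase letters regardless of the system locale.

```shell
sample=$(mktemp)
printf '%s\n' 'apple is my favorite food.' \
              'Football game is not use feet only.' \
              '# I am VBird' > "$sample"

# ^ outside the brackets anchors to the line start:
# lines whose first character is a lowercase letter.
starts_lower=$(LC_ALL=C grep -n '^[a-z]' "$sample")

# The first ^ anchors, the second one (inside the brackets) negates:
# lines whose first character is anything but a lowercase letter.
starts_other=$(LC_ALL=C grep -n '^[^a-z]' "$sample")

printf 'lowercase start:\n%s\n' "$starts_lower"
printf 'other start:\n%s\n' "$starts_other"
rm -f "$sample"
```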
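10. Line-end anchor $

Find the lines that end with a period (.). The period must be escaped as \. because an unescaped . matches any character. Lines 5 to 9 and line 14 look as if they end with a period too, but they are missing from the output: they end with a carriage return (^M), so the character right before the line end is not a period.

grep -n '\.$' regular_express.txt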

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.
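
The effect of DOS-style line endings on the $ anchor can be reproduced on a two-line file. This is only a sketch: the file is generated with printf, and piping through tr -d '\r' is one common way to drop the carriage returns, not something the original text prescribes.

```shell
sample=$(mktemp)
# Line 1 ends with LF only; line 2 ends with CR LF (shown as ^M in the listing).
printf 'apple is my favorite food.\nGNU is free air not free beer.\r\n' > "$sample"

# Only line 1 matches: on line 2 the character before the line end is CR, not ".".
plain=$(grep -n '\.$' "$sample")

# With the carriage returns deleted first, both lines match.
stripped=$(tr -d '\r' < "$sample" | grep -n '\.$')

printf 'as is:\n%s\n' "$plain"
printf 'CR removed:\n%s\n' "$stripped"
rm -f "$sample"
```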
Finding blank lines: ^$ means the line start is immediately followed by the line end, i.e. the line is empty.

grep -n '^$' regular_express.txt

22:

Displaying a file without its blank lines and comment lines

grep -v '^$' export-1.sh |grep -v '^#'
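
The same filtering can be done in a single grep call, because -e may be given more than once and -v then drops every line that matches any of the patterns. The sketch below compares both forms on a generated file; its contents are a hypothetical stand-in, since export-1.sh itself is not shown.

```shell
# Hypothetical stand-in for export-1.sh.
script=$(mktemp)
printf '%s\n' '#!/bin/bash' '' '# set a variable' 'name=VBird' '' 'echo "$name"' > "$script"

# Two-stage pipeline, as in the text.
two_step=$(grep -v '^$' "$script" | grep -v '^#')

# One grep with two patterns: a line is dropped if it is empty OR starts with #.
one_step=$(grep -v -e '^$' -e '^#' "$script")

# Prints the two remaining lines: name=VBird and echo "$name"
printf '%s\n' "$one_step"
rm -f "$script"
```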
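11. Any single character . and the repetition character *

. (period): stands for exactly one character, whatever it is

* (asterisk): repeats the preceding character zero or more times; it is always used in combination with another RE character


Find the lines where g and d have exactly two characters between them

grep -n 'g..d' regular_express.txt
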

1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".
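Because * repeats the preceding RE character zero or more times, "o*" means an empty string or one or more o. Note in particular that, since the empty string is allowed (no character at all is fine),

"grep -n 'o*' regular_express.txt" prints every line of the file.

What about "oo*"? The first o must be present and the second o may repeat zero or more times, so any line containing o, oo, ooo, oooo and so on is listed.

Likewise, to require at least two consecutive o, we need ooo*, that is

grep -n 'ooo*' regular_express.txt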

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
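
The three patterns o*, oo* and ooo* can be compared side by side by counting how many lines each one selects. A minimal sketch on a made-up four-line file (one line without any o, one with a single o, two with a run of o):

```shell
sample=$(mktemp)
printf '%s\n' 'I like dog.' \
              'apple is my favorite food.' \
              'goooooogle yes!' \
              'This is it.' > "$sample"

total=$(wc -l < "$sample" | tr -d ' ')    # 4 lines in the file
zero_or_more=$(grep -c 'o*' "$sample")    # empty match allowed: all 4 lines
one_or_more=$(grep -c 'oo*' "$sample")    # at least one o: 3 lines
two_or_more=$(grep -c 'ooo*' "$sample")   # at least two consecutive o: 2 lines

echo "total=$total o*=$zero_or_more oo*=$one_or_more ooo*=$two_or_more"
rm -f "$sample"
```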

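The combination .* means zero or more arbitrary characters, so 'g.*g' matches a g, then anything (or nothing), then another g.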

grep -n 'g.*g' regular_express.txt

1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
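
Two properties of .* are easy to miss in the listing above: it may match nothing at all, and it takes as much of the line as it can. The sketch below shows both, assuming a grep with -o (GNU and BSD grep) so that only the matched text is printed; the short inputs are made up for the demonstration.

```shell
# .* may stand for zero characters, so two adjacent g's already match.
zero_between=$(printf '%s\n' 'egg' | grep -o 'g.*g')

# .* takes as much as possible: the match runs from the first g to the LAST g.
greedy=$(printf '%s\n' 'go! go! go.' | grep -o 'g.*g')

# A single g is not enough; grep -c reports 0 matching lines.
# (|| true keeps the script going, since grep exits non-zero on no match.)
single_g=$(printf '%s\n' 'glad' | grep -c 'g.*g' || true)

echo "zero between: $zero_between"   # gg
echo "greedy:       $greedy"         # go! go! g
echo "single g:     $single_g"       # 0
```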

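12. Limiting the number of consecutive repetitions with {}

An interval restricts how many times the preceding character may repeat.

In basic regular expressions, which grep uses by default, a bare { or } is an ordinary character, so the interval has to be written with backslashes as \{ and \}. The single quotes keep the shell from interpreting the backslashes before grep sees them.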

grep -n 'o\{2\}' regular_express.txt

1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
grep -n 'go\{2,5\}g' regular_express.txt

18:google is the best tools for search keyword.

grep -n 'go\{2,\}g' regular_express.txt

18:google is the best tools for search keyword.
19:goooooogle yes!
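
For comparison, grep -E switches to extended regular expressions, where the interval is written without backslashes. The sketch below runs both spellings on a small made-up file; it is meant as an illustration of the two syntaxes, and the -E form is not used elsewhere in this article.

```shell
sample=$(mktemp)
printf '%s\n' 'google is the best tools for search keyword.' \
              'goooooogle yes!' \
              'gog' > "$sample"

# Basic RE: interval written with backslashes.
bre=$(grep -n 'go\{2,5\}g' "$sample")

# Extended RE (-E): the same interval without backslashes.
ere=$(grep -En 'go{2,5}g' "$sample")

# Basic RE with bare braces: they are ordinary characters, so grep looks for
# the literal text "go{2,5}g" and finds nothing.
literal=$(grep -c 'go{2,5}g' "$sample" || true)

echo "BRE: $bre"
echo "ERE: $ere"
echo "bare braces in BRE: $literal matching lines"
rm -f "$sample"
```

Only "google" (two o) falls inside the 2 to 5 range; "goooooogle" has six o and "gog" has one.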

Basic regular expression characters
RE character
Meaning and example
  ^word
Meaning: the search string (word) is at the start of the line
Example: find the line that starts with # and print its line number

grep -n '^#' regular_express.txt
  word$
Meaning: the search string (word) is at the end of the line
Example: print the lines that end with !, together with their line numbers

grep -n '!$' regular_express.txt
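  .
Meaning: stands for exactly one arbitrary character
Example: the matched string can be (eve), (eae), (eee) or (e e), but not just (ee); there must be exactly one character between e and e, and a space counts as a character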

grep -n 'e.e' regular_express.txt
  \
Meaning: escape character; removes the special meaning of a special symbol
Example: find the lines that contain a single quote '

grep -n \' regular_express.txt
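  *
Meaning: repeats the preceding character zero or more times
Example: find strings containing (es), (ess), (esss) and so on. Because * allows zero repetitions, es also matches. And because * repeats "the preceding RE character", it must come directly after an RE character, as in ".*" for any run of characters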

grep -n 'ess*' regular_express.txt
  [list]
Meaning: matches one character out of the listed set
Example: find the lines containing (gl) or (gd). Keep in mind that [] stands for a single character: "a[afl]y" matches aay, afy or aly, i.e. [afl] means a or f or l

grep -n 'g[ld]' regular_express.txt
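  [n1-n2]
Meaning: matches one character out of the listed range
Example: find the lines that contain any digit. Inside a character set [] the minus sign - is special: it stands for all consecutive characters between the two endpoints. Whether characters are consecutive depends on the ASCII encoding, so your locale must be set correctly (in bash, check the LANG and LANGUAGE variables)! For example, all uppercase letters are [A-Z]

grep -n '[0-9]' regular_express.txt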
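  [^list]
Meaning: matches one character that is NOT in the listed set or range
Example: the matched string can be (oog) or (ood) but not (oot). Inside [], a leading ^ means "negated selection". For example, [^A-Z] excludes uppercase letters. Be careful, though: grep -n '[^A-Z]' regular_express.txt lists every non-empty line of the file. Why? Because [^A-Z] means "one character that is not uppercase", and every non-empty line contains at least one such character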

grep -n 'oo[^t]' regular_express.txt
  \{n,m\}
Meaning: n to m consecutive repetitions of the preceding RE character; \{n\} means exactly n repetitions and \{n,\} means n or more
Example: strings with 2 to 3 o between g and g, i.e. (goog) and (gooog)

grep -n 'go\{2,3\}g' regular_express.txt
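
The [^A-Z] pitfall from the table deserves a concrete check. A negated set still has to match one character, so it selects lines that contain some non-uppercase character, not lines that are free of uppercase letters. For the latter, one option is to invert a positive match with -v. The sketch below uses a made-up three-line file and sets LC_ALL=C as an assumption, so that [A-Z] means exactly the ASCII uppercase letters.

```shell
sample=$(mktemp)
printf '%s\n' 'GNU is free air not free beer.' \
              'apple is my favorite food.' \
              'ABC' > "$sample"

# One non-uppercase character anywhere is enough: lines 1 and 2 match,
# only the all-uppercase line 3 does not.
any_non_upper=$(LC_ALL=C grep -n '[^A-Z]' "$sample")

# Lines with no uppercase letter at all: select [A-Z], then invert with -v.
no_upper=$(LC_ALL=C grep -vn '[A-Z]' "$sample")

printf 'contains a non-uppercase character:\n%s\n' "$any_non_upper"
printf 'contains no uppercase character:\n%s\n' "$no_upper"
rm -f "$sample"
```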

