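A regular expression is a method for processing strings. It works line by line, and with the help of a few special characters it lets the user easily search for, delete, or replace a particular string.
A regular expression is essentially a notation: any tool that supports the notation can be used for regular-expression processing.
Note: regular expressions and wildcards are completely different things. Wildcards are a feature of the bash interface, while regular expressions are a way of processing strings.
When using regular expressions, pay close attention to the locale of the current environment, otherwise you may get different matches than other people do.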
Aside: check the current locale with
# echo $LANG
zh_CN.UTF-8 Simplified Chinese
zh_TW.UTF-8 Traditional Chinese
en_US.UTF-8 English (US)
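To illustrate the locale point, here is a minimal sketch that pins the locale for a single grep call so that a bracket range behaves predictably. The sample words are made up for illustration and are not from regular_express.txt.

```shell
# Hypothetical sample input: three words, one of them capitalised.
sample=$(printf '%s\n' 'apple' 'Banana' 'cherry')

# LC_ALL=C applies only to this grep call. In the C locale, [a-z] covers
# exactly the ASCII lowercase letters, so 'Banana' is not selected.
lower_start=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[a-z]')

printf '%s\n' "$lower_start"
```

Setting the variable on the command line leaves the rest of the session in its usual locale.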
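Special characters
[[:alnum:]] English letters of either case and digits, i.e. 0-9, A-Z, a-z
[[:alpha:]] any English letter of either case, i.e. A-Z, a-z
[[:space:]] any character that produces white space, including the space bar, [Tab], and so on
[[:digit:]] digits, i.e. 0-9
[[:lower:]] lowercase letters, i.e. a-z
[[:upper:]] uppercase letters, i.e. A-Z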
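The classes listed above can be tried out with grep on a small input. This is only a sketch: the sample lines are invented, and the C locale is pinned so the classes cover ASCII only.

```shell
# Hypothetical sample: letters, capitals, a mix with digits, blanks only, and a line with a space.
sample=$(printf '%s\n' 'abc' 'ABC' 'a1b2' '   ' 'x y')

# Lines that contain at least one digit, with line numbers.
with_digit=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[[:digit:]]')

# Lines that begin with an uppercase letter.
starts_upper=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '^[[:upper:]]')

# Number of lines that contain any whitespace character.
with_space=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '[[:space:]]')

printf '%s\n' "$with_digit" "$starts_upper" "$with_space"
```

Note the doubled brackets: the class name [:digit:] only has meaning inside a bracket expression.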
grep
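-A n print each matching line plus the n lines after it
-B n print each matching line plus the n lines before it
--color highlight the matched text in color
eg1 List the kernel messages with dmesg, then use grep to find the lines that contain eth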
[root@liu ~]# dmesg | grep 'eth'
[ 1.883825] e1000 0000:02:01.0 eth0: (PCI:66MHz:32-bit) 00:0c:29:70:a3:b8
[ 1.883831] e1000 0000:02:01.0 eth0: Intel(R) PRO/1000 Network Connection
Continuing from above, also show the three lines before and the two lines after the eth lines
[root@liu ~]# dmesg | grep -A2 -B3 'eth'
[ 1.752168] sr 2:0:0:0: Attached scsi CD-ROM sr0
[ 1.758979] hub 2-2:1.0: USB hub found
[ 1.760394] hub 2-2:1.0: 7 ports detected
[ 1.883825] e1000 0000:02:01.0 eth0: (PCI:66MHz:32-bit) 00:0c:29:70:a3:b8
[ 1.883831] e1000 0000:02:01.0 eth0: Intel(R) PRO/1000 Network Connection
[ 2.051199] usb 2-2.1: new full-speed USB device number 4 using uhci_hcd
[ 2.079742] SGI XFS with ACLs, security attributes, no debug enabled
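Because dmesg output differs on every machine, the context options can also be tried on a small fixed input. A minimal sketch, with made-up log lines:

```shell
# Hypothetical seven-line log with a single line that mentions eth.
log=$(printf '%s\n' 'line1' 'line2' 'line3' 'eth0 up' 'line5' 'line6' 'line7')

# -B 2 keeps two lines before the match, -A 1 keeps one line after it.
context=$(printf '%s\n' "$log" | grep -A 1 -B 2 'eth')

printf '%s\n' "$context"
```

With these options the result should be line2, line3, eth0 up and line5.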
The following examples work on a prepared text file, regular_express.txt
eg2 Find the lines that contain the string the and show their line numbers
[root@liu ~]# cat regular_express.txt | grep -n 'the'
8:I can't finish the test.^M
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.
Inverted match: list the lines that do not contain the
[root@liu ~]# cat regular_express.txt | grep -vn 'the'
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.this dress doesn't fit me.
4:this dress doesn't fit me
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
17:I like dog.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am Vbird
The -v option is what inverts the match
Find the lines that contain the, ignoring case (-i)
[root@liu ~]# cat regular_express.txt | grep -i "the"
I can't finish the test.^M
Oh! The soup taste good.^M
the symbol '*' is represented as start.
The gd software is a library for drafting programs.^M
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
google is the best tools for search keyword.
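The three options used in eg2 (-n, -v, -i) can be compared side by side on a tiny input. A sketch with invented lines; -c, which prints a count instead of the lines, is not used in the text above and is added here only to keep the result short.

```shell
# Hypothetical three-line sample.
text=$(printf '%s\n' 'The cat' 'the dog' 'a bird')

# Case-sensitive match with line numbers: only the lowercase "the".
exact=$(printf '%s\n' "$text" | grep -n 'the')

# Inverted match: every line that does NOT contain lowercase "the".
inverted=$(printf '%s\n' "$text" | grep -vn 'the')

# Case-insensitive match, counted with -c.
nocase=$(printf '%s\n' "$text" | grep -ci 'the')

printf '%s\n' "$exact" "$inverted" "$nocase"
```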
eg3 Use [ ] to match a set of characters
>>3.1 Find the two words test and taste
[root@liu ~]# grep 't[ae]st' regular_express.txt
I can't finish the test.^M
Oh! The soup taste good.^M
this is a test
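The characters inside [ ] are alternatives: the bracket expression matches exactly one character, which may be any of those listed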
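A short sketch of the bracket expression on invented words, to show that [ae] stands for one character only:

```shell
# Hypothetical word list.
words=$(printf '%s\n' 'test' 'taste' 'toast' 'tst')

# t, then exactly one of a or e, then st.
# 'toast' fails because o follows the t; 'tst' fails because nothing sits between t and st.
matched=$(printf '%s\n' "$words" | grep 't[ae]st')

printf '%s\n' "$matched"
```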
Find the lines that contain 'oo'
[root@liu ~]# grep "oo" regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.this dress doesn't fit me.
Oh! The soup taste good.^M
google is the best tools for search keyword.
goooooogle yes!
Exclude an oo that is preceded by g
[root@liu ~]# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.this dress doesn't fit me.
18:google is the best tools for search keyword.
19:goooooogle yes!
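Lines 18 and 19 still appear: in line 18 the oo of tools is preceded by t, and in goooooogle an oo can be preceded by another o
^ outside [ ] anchors the match to the start of the line
[^ ] with ^ as the first character inside the brackets negates the set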
For example, list only the lines that begin with the
[root@liu ~]# grep -n '^the' regular_express.txt
12:the symbol '*' is represented as start.
List the lines that begin with a lowercase letter
[root@liu ~]# cat regular_express.txt | grep -n '^[a-z]'
2:apple is my favorite food.
4:this dress doesn't fit me
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
23:this is a test
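The two roles of ^ are easy to mix up, so here is a small sketch that uses both on the same invented input:

```shell
# Hypothetical sample lines.
lines=$(printf '%s\n' 'the end' 'at the end' 'good' 'food' 'gooood')

# ^ outside brackets: "the" must sit at the very start of the line.
at_start=$(printf '%s\n' "$lines" | grep -n '^the')

# ^ inside brackets: one character that is not g, followed by oo.
# 'good' is dropped, but 'gooood' stays because an o precedes a later oo.
not_g=$(printf '%s\n' "$lines" | grep -n '[^g]oo')

printf '%s\n' "$at_start" "$not_g"
```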
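List the lines that end with a period (the . must be escaped as \. to be taken literally). Lines shown elsewhere with a trailing ^M end in a carriage return, so the period is not their last character and they are not matched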
[root@liu ~]# grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.
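The ^M in the listings is the usual way a carriage return is displayed, which is why those lines are missing from the '\.$' result. A minimal sketch of the effect and of one possible workaround, stripping the carriage returns with tr before matching. The two-line input is made up, and tr is not something the text above uses.

```shell
# Count lines ending in a period: the first line has a DOS (CRLF) ending,
# so its last character before the newline is \r, not the period.
raw=$(printf 'dos line.\r\nunix line.\n' | grep -c '\.$')

# Same input with carriage returns removed first: both lines now end in a period.
fixed=$(printf 'dos line.\r\nunix line.\n' | tr -d '\r' | grep -c '\.$')

printf 'raw=%s fixed=%s\n' "$raw" "$fixed"
```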
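Any single character and repetition
* repeats the preceding character zero or more times; a regular expression is not a wildcard
. matches exactly one character of any kind
eg4 Find strings of the form g??d (g, any two characters, then d)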
[root@liu ~]# grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
16:The world <Happy> is the same with "glad".
Find the lines that contain at least one digit
[root@liu ~]# grep -n '[0-9][0-9]*' regular_express.txt
5:However, this dress is about $ 3183 dollars.^M
15:You are the best is mean you are the no. 1.
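Both patterns from this example can be checked on a small invented input. The sketch shows that each . consumes exactly one character and that [0-9][0-9]* means "one digit, then any number of further digits".

```shell
# Hypothetical sample lines.
data=$(printf '%s\n' 'good' 'glad' 'gd' 'god' 'no digits' 'year 2024')

# g, exactly two arbitrary characters, then d: 'gd' and 'god' are too short.
four=$(printf '%s\n' "$data" | grep 'g..d')

# At least one digit somewhere on the line.
digits=$(printf '%s\n' "$data" | grep -n '[0-9][0-9]*')

printf '%s\n' "$four" "$digits"
```

For simply selecting lines, '[0-9]' alone would pick the same ones; the longer form matters when the whole number is the thing being matched.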
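eg5 Limit the number of consecutive repetitions with { }
{ } have a special meaning in the shell, and in grep's default basic regular expression syntax the interval braces must be escaped with a backslash, written as \{ and \}
>>eg5.1 Find strings with two consecutive o's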
[root@liu ~]# grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.this dress doesn't fit me.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
>>eg5.2 Find strings of two to five consecutive o's (the pattern as written does not require a g in front)
[root@liu ~]# grep -n 'o\{2,5\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.this dress doesn't fit me.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
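Since 'o\{2,5\}' is not anchored on either side, it also matches inside a longer run of o's. The sketch below contrasts it with a bounded variant, 'go\{2,5\}g', which is an assumption added here for illustration and does not appear in the session above. It also shows the same pattern with grep -E, where the braces are written without backslashes.

```shell
# Hypothetical word list with 1, 2, 3 and 6 consecutive o's.
words=$(printf '%s\n' 'gogle' 'google' 'gooogle' 'goooooogle')

# Unanchored: any line containing at least two o's in a row matches.
loose=$(printf '%s\n' "$words" | grep 'o\{2,5\}')

# Bounded by g on both sides: only runs of two to five o's between two g's.
bounded=$(printf '%s\n' "$words" | grep 'go\{2,5\}g')

# Same bounded pattern in extended syntax (-E): braces need no backslash.
extended=$(printf '%s\n' "$words" | grep -E 'go{2,5}g')

printf '%s\n' "$loose" "$bounded" "$extended"
```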
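As a wildcard, * stands for zero or more arbitrary characters; in a regular expression, it repeats the preceding character zero or more times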
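The difference between the two meanings of * can be seen by testing one string against a shell pattern and against regular expressions. A minimal sketch; the string gabcd and the variable names are made up for illustration.

```shell
# Wildcard: in a case pattern, * matches any run of characters.
glob_hit=no
case 'gabcd' in
  g*d) glob_hit=yes ;;
esac

# Regular expression: in g*d the * only repeats the preceding g,
# so the whole string 'gabcd' does not fit ^g*d$.
regex_hit=no
if printf '%s\n' 'gabcd' | grep -q '^g*d$'; then regex_hit=yes; fi

# The regular expression that corresponds to the wildcard g*d is g.*d.
dotstar_hit=no
if printf '%s\n' 'gabcd' | grep -q '^g.*d$'; then dotstar_hit=yes; fi

printf 'glob=%s regex=%s dotstar=%s\n' "$glob_hit" "$regex_hit" "$dotstar_hit"
```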