正则表达式介绍
随着着过滤条件的复杂性的增加,WHERE子句本身的复杂性也有必要增加。就需要使用正则表达式。
正则表达式是用来匹配文本的特殊的串(字符集合)。
使用MySQL正则表达式
基本字符匹配
MariaDB [test]> select * from linux where passwd REGEXP '1';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user1 | 111 | boy | 18 |
| 1 | pass1 | girl | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.01 sec)
MariaDB [test]> select * from linux where passwd REGEXP '1' order by user,sex;
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| 1 | pass1 | girl | 30 |
| user1 | 111 | boy | 18 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.00 sec)
除关键字LIKE
被REGEXP
替代外,这条语句看上去非常像使用LIKE的语句。它告诉MySQL:REGEXP后所跟的东西作为正则表达式(与文字正文1000匹配的一个正则表达式)处理。
MariaDB [test]> select * from linux where age REGEXP '.2' order by user,sex;
+-------+--------+-----+------+
| user | passwd | sex | age |
+-------+--------+-----+------+
| a | passa | boy | 22 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
+-------+--------+-----+------+
3 rows in set (0.00 sec)
MariaDB [test]> select * from linux where passwd REGEXP '.1' order by user,sex;
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| 1 | pass1 | girl | 30 |
| user1 | 111 | boy | 18 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.00 sec)
.是正则表达式语言中一个特殊的字符。它表示匹配任意字符,
我们可以知道,LIKE 参数 和 REGEXP 的区别在于,REGEXP可以不使用通配符,当列中出现相应的字符时,就显示出来。
MariaDB [test]> select * from linux where passwd REGEXP '1';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user1 | 111 | boy | 18 |
| 1 | pass1 | girl | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.01 sec)
MariaDB [test]> select * from linux where passwd like '1';
Empty set (0.00 sec)
进行OR匹配
使用 | 来进行 OR 匹配
MariaDB [test]> select * from linux where user REGEXP '2|3';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
+-------+--------+------+------+
2 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP '2|3|4';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
+-------+--------+------+------+
3 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP '2|3|4|5';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
| user5 | 555 | boy | 30 |
+-------+--------+------+------+
4 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP '2|3|4|5|a';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
| user5 | 555 | boy | 30 |
| a | passa | boy | 22 |
+-------+--------+------+------+
5 rows in set (0.00 sec)
匹配几个字符之一
可以理解为 [ ] 是 OR的缩写
MariaDB [test]> select * from linux where user REGEXP 'user[2]';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
+-------+--------+------+------+
1 row in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP 'user[23]';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
+-------+--------+------+------+
2 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP 'user[234]';
+-------+--------+------+------+
| user | passwd | sex | age |
+-------+--------+------+------+
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
+-------+--------+------+------+
3 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP 'user 2|3|4';
+-------+--------+-----+------+
| user | passwd | sex | age |
+-------+--------+-----+------+
| user3 | 333 | boy | 12 | /像这种使用OR的方式 就是错误的,所以我们才会推出使用 []
| user4 | 444 | boy | 22 |
+-------+--------+-----+------+
2 rows in set (0.00 sec)
^反向匹配
MariaDB [test]> select * from linux where user REGEXP 'user[^234]';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user1 | 111 | boy | 18 |
| user5 | 555 | boy | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.00 sec)
匹配范围
集合可用来定义要匹配的一个或多个字符。
MariaDB [test]> select * from linux where user REGEXP 'user[^2-4]';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user1 | 111 | boy | 18 |
| user5 | 555 | boy | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
5 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP 'user[^1-4]';
+-------+--------+-----+------+
| user | passwd | sex | age |
+-------+--------+-----+------+
| user5 | 555 | boy | 30 |
+-------+--------+-----+------+
1 row in set (0.00 sec)
这里的 1-4,2-4定义了一个范围
匹配特殊字符
匹配特殊字符,必须用\\
为前导。\\-
表示查找-
,\\.
表示查找.
MariaDB [test]> insert into linux value('user1.1','sad','boy',42);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP '.';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user1 | 111 | boy | 18 |
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
| user5 | 555 | boy | 30 |
| a | passa | boy | 22 | /可见这种方式是行不通的
| 1 | pass1 | girl | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
| user1.1 | sad | boy | 42 |
+---------+--------+------+------+
11 rows in set (0.00 sec)
MariaDB [test]> select * from linux where user REGEXP '\\.';
+---------+--------+-----+------+
| user | passwd | sex | age |
+---------+--------+-----+------+
| user1.1 | sad | boy | 42 |
+---------+--------+-----+------+
1 row in set (0.00 sec)
匹配字符类
MariaDB [test]> select * from linux where passwd REGEXP '[[:alpha:]]';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| a | passa | boy | 22 |
| 1 | pass1 | girl | 30 | /匹配密码中还有任意字符的
| user1 | pass1 | girl | 18 |
| user1.1 | sad | boy | 42 |
+---------+--------+------+------+
4 rows in set (0.00 sec)
匹配多个实例
MariaDB [test]> insert into linux value ('(cay)','passcay','girl',21);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> insert into linux value ('(cays)','passcays','girl',21);
Query OK, 1 row affected (0.00 sec)
MariaDB [test]> select * from linux;
+---------+----------+------+------+
| user | passwd | sex | age |
+---------+----------+------+------+
| user1 | 111 | boy | 18 |
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 |
| user5 | 555 | boy | 30 |
| a | passa | boy | 22 |
| 1 | pass1 | girl | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
| user1.1 | sad | boy | 42 |
| (cay) | passcay | girl | 21 |
| (cays) | passcays | girl | 21 |
+---------+----------+------+------+
13 rows in set (0.00 sec)
MariaDB [test]> select * from linux WHERE user REGEXP '\\(cays?\\)';
+--------+----------+------+------+
| user | passwd | sex | age |
+--------+----------+------+------+
| (cay) | passcay | girl | 21 |
| (cays) | passcays | girl | 21 |
+--------+----------+------+------+
2 rows in set (0.00 sec)
\\(用来匹配(,\\)用来匹配),?代表cay后面的 s出现0 或者1次
MariaDB [test]> select * from linux WHERE user REGEXP 'user1{3}';
+---------+--------+-----+------+
| user | passwd | sex | age |
+---------+--------+-----+------+ /指定三次
| user111 | 111111 | boy | 41 |
+---------+--------+-----+------+
1 row in set (0.00 sec)
MariaDB [test]> select * from linux WHERE user REGEXP 'user1{2,}';
+---------+--------+------+------+
| user | passwd | sex | age |
+---------+--------+------+------+
| user11 | 1111 | girl | 31 | /二次以上
| user111 | 111111 | boy | 41 |
+---------+--------+------+------+
2 rows in set (0.00 sec)
MariaDB [test]> select * from linux WHERE user REGEXP '[[:digit:]]{3}';
+---------+--------+-----+------+
| user | passwd | sex | age |
+---------+--------+-----+------+
| user111 | 111111 | boy | 41 | /连续出现三个数字
+---------+--------+-----+------+
1 row in set (0.00 sec)
[[:digit:]]匹配数字,{3}匹配连续出现3次。
MariaDB [test]> select * from linux WHERE user REGEXP '[0-9][0-9][0-9]';
+---------+--------+-----+------+
| user | passwd | sex | age |
+---------+--------+-----+------+
| user111 | 111111 | boy | 41 |
+---------+--------+-----+------+
1 row in set (0.00 sec)
我们也可以这么写。
定位符
MariaDB [test]> select * from linux WHERE user REGEXP '^[a1\\(]';
+--------+----------+------+------+
| user | passwd | sex | age |
+--------+----------+------+------+
| a | passa | boy | 22 | /表示以a或 1或(开头的
| 1 | pass1 | girl | 30 |
| (cay) | passcay | girl | 21 |
| (cays) | passcays | girl | 21 |
+--------+----------+------+------+
4 rows in set (0.00 sec)
MariaDB [test]> select * from linux WHERE passwd REGEXP '[[:digit:]s]$';
+---------+----------+------+------+
| user | passwd | sex | age |
+---------+----------+------+------+
| user1 | 111 | boy | 18 |
| user2 | 222 | girl | 23 |
| user3 | 333 | boy | 12 |
| user4 | 444 | boy | 22 | /表示以数字或s结尾的
| user5 | 555 | boy | 30 |
| 1 | pass1 | girl | 30 |
| user1 | pass1 | girl | 18 |
| user11 | 1111 | girl | 31 |
| user111 | 111111 | boy | 41 |
| (cays) | passcays | girl | 21 |
+---------+----------+------+------+
10 rows in set (0.00 sec)
MariaDB [test]> select * from linux WHERE passwd REGEXP '[[:alpha:]5]$';
+---------+----------+------+------+
| user | passwd | sex | age |
+---------+----------+------+------+
| user5 | 555 | boy | 30 | /表示以字母或者5结尾的。
| a | passa | boy | 22 |
| user1.1 | sad | boy | 42 |
| (cay) | passcay | girl | 21 |
| (cays) | passcays | girl | 21 |
+---------+----------+------+------+
5 rows in set (0.00 sec)
^的双重用途 ^有两种用法。在集合中(用[和]定义),用它来否定该集合,否则,用来指串的开始处。
不使用数据库的测试,有返回1,没有返回0
MariaDB [test]> select 'hello' REGEXP '[[:digit:]]';
+------------------------------+
| 'hello' REGEXP '[[:digit:]]' |
+------------------------------+
| 0 |
+------------------------------+
1 row in set (0.00 sec)
MariaDB [test]> select 'hello' REGEXP '[0-9]';
+------------------------+
| 'hello' REGEXP '[0-9]' |
+------------------------+
| 0 |
+------------------------+
1 row in set (0.00 sec)
MariaDB [test]> select 'hello' REGEXP '[[:alpha:]]';
+------------------------------+
| 'hello' REGEXP '[[:alpha:]]' |
+------------------------------+
| 1 |
+------------------------------+
1 row in set (0.00 sec)
MariaDB [test]> select 'hello' REGEXP '[:alpha:]';
+----------------------------+
| 'hello' REGEXP '[:alpha:]' |
+----------------------------+
| 1 |
+----------------------------+
1 row in set (0.00 sec)