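Character Classes

If you browse the Pattern class specification, you will see tables summarizing the supported regular expression constructs. Table 13-1 describes the character classes.

The left-hand column specifies the regular expression construct, and the right-hand column describes the conditions under which each construct matches.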

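Table 13-1  Character classes

Construct         Matches
[abc]             a, b, or c (simple class)
[^abc]            Any character except a, b, or c (negation)
[a-zA-Z]          a through z, or A through Z, inclusive (range)
[a-d[m-p]]        a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]      d, e, or f (intersection)
[a-z&&[^bc]]      a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]     a through z, and not m through p: [a-lq-z] (subtraction)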

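Note: The word "class" in the phrase "character class" does not refer to a .class file. In the context of regular expressions, a character class is a set of characters enclosed within square brackets. It specifies the characters that will successfully match a single character of a given input string.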
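The equivalences listed in Table 13-1 can be checked mechanically. The following is a minimal sketch, not part of the original text: the class name CharClassEquivalence and its helper methods are hypothetical, and the comparison is limited to the ASCII range (0 to 127) as a simplifying assumption.

```java
import java.util.regex.Pattern;

class CharClassEquivalence {
    // Returns every ASCII character (0-127) that the given
    // single-character regex matches, in ascending order.
    static String matchedAscii(String regex) {
        Pattern p = Pattern.compile(regex);
        StringBuilder sb = new StringBuilder();
        for (char c = 0; c < 128; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Two character classes are treated as equivalent here if they
    // accept exactly the same ASCII characters.
    static boolean sameClass(String a, String b) {
        return matchedAscii(a).equals(matchedAscii(b));
    }

    public static void main(String[] args) {
        System.out.println(sameClass("[a-d[m-p]]", "[a-dm-p]"));     // true
        System.out.println(sameClass("[a-z&&[^bc]]", "[ad-z]"));     // true
        System.out.println(sameClass("[a-z&&[^m-p]]", "[a-lq-z]"));  // true
        System.out.println(matchedAscii("[a-z&&[def]]"));            // def
    }
}
```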

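Simple classes

The most basic form of a character class is a set of characters placed side by side within square brackets. For example, the regular expression [bcr]at matches the words "bat", "cat", or "rat" because it defines a character class (accepting "b", "c", or "r") as its first character: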

Enter your regex: [bcr]at

Enter input string to search: bat

I found the text "bat" starting at index 0 and ending at index 3.

Enter your regex: [bcr]at

Enter input string to search: cat

I found the text "cat" starting at index 0 and ending at index 3.

Enter your regex: [bcr]at

Enter input string to search: rat

I found the text "rat" starting at index 0 and ending at index 3.

Enter your regex: [bcr]at

Enter input string to search: hat

No match found.

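In the examples above, the overall match succeeds only when the first character matches one of the characters defined by the character class.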
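The transcripts in this section come from an interactive prompt whose source is not shown here. As a rough sketch of how such output could be produced with java.util.regex, the hypothetical RegexSearch class below formats each match the same way; its name and the search method are assumptions for illustration, not the original harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSearch {
    // Finds every occurrence of regex in input and reports it in the
    // same wording as the transcripts in this section.
    static List<String> search(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            lines.add(String.format(
                "I found the text \"%s\" starting at index %d and ending at index %d.",
                m.group(), m.start(), m.end()));
        }
        if (lines.isEmpty()) {
            lines.add("No match found.");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String input : new String[] {"bat", "cat", "rat", "hat"}) {
            System.out.println("Input: " + input);
            search("[bcr]at", input).forEach(System.out::println);
        }
    }
}
```

Note that Matcher.end() returns the index just past the last matched character, which is why a three-letter match starting at index 0 is reported as ending at index 3.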

1. Negation

To match all characters except those listed, insert "^" at the beginning of the character class. This technique is known as negation:

Enter your regex: [^bcr]at

Enter input string to search: bat

No match found.

Enter your regex: [^bcr]at

Enter input string to search: cat

No match found.

Enter your regex: [^bcr]at

Enter input string to search: rat

No match found.

Enter your regex: [^bcr]at

Enter input string to search: hat

I found the text "hat" starting at index 0 and ending at index 3.

The match succeeds only if the first character of the input string is not one of the characters defined in the character class.
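One detail the transcripts leave implicit is that a negated class must still consume exactly one character; it does not match "nothing". The sketch below illustrates this with a hypothetical NegationDemo class (the name and the found helper are assumptions for illustration).

```java
import java.util.regex.Pattern;

class NegationDemo {
    // True if regex occurs anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("[^bcr]at", "hat"));   // true
        System.out.println(found("[^bcr]at", "bat"));   // false
        // "at" has no character in front of it for [^bcr] to consume.
        System.out.println(found("[^bcr]at", "at"));    // false
        // find() scans the whole input, so "hat" inside "that" is found.
        System.out.println(found("[^bcr]at", "that"));  // true
    }
}
```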

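2. Ranges

Sometimes you will want to define a character class that includes a range of values, such as the letters "a" through "h" or the digits "1" through "5". To specify a range, insert "-" between the first and last characters to be matched, as in [1-5] or [a-h]. You can also place different ranges next to each other within the class to further expand the match possibilities. For example, [a-zA-Z] matches any letter of the alphabet: a to z (lowercase) or A to Z (uppercase).

Here are some examples of ranges and negation: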

Enter your regex: [a-c]

Enter input string to search: a

I found the text "a" starting at index 0 and ending at index 1.

Enter your regex: [a-c]

Enter input string to search: b

I found the text "b" starting at index 0 and ending at index 1.

Enter your regex: [a-c]

Enter input string to search: c

I found the text "c" starting at index 0 and ending at index 1.

Enter your regex: [a-c]

Enter input string to search: d

No match found.

Enter your regex: foo[1-5]

Enter input string to search: foo1

I found the text "foo1" starting at index 0 and ending at index 4.

Enter your regex: foo[1-5]

Enter input string to search: foo5

I found the text "foo5" starting at index 0 and ending at index 4.

Enter your regex: foo[1-5]

Enter input string to search: foo6

No match found.

Enter your regex: foo[^1-5]

Enter input string to search: foo1

No match found.

Enter your regex: foo[^1-5]

Enter input string to search: foo6

I found the text "foo6" starting at index 0 and ending at index 4.
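The range examples above test one short input at a time. As an illustrative sketch, the same patterns can be applied to a longer string to collect every match; the RangeDemo class, the allMatches helper, and the sample input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeDemo {
    // Collects the text of every match of regex within input.
    static List<String> allMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "foo0 foo1 foo3 foo6 foo5";
        System.out.println(allMatches("foo[1-5]", input));   // [foo1, foo3, foo5]
        System.out.println(allMatches("foo[^1-5]", input));  // [foo0, foo6]
        // Adjacent ranges in one class: counts the letters in "Java 8!".
        System.out.println(allMatches("[a-zA-Z]", "Java 8!").size());  // 4
    }
}
```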

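3. Unions

You can also use unions to create a single character class made up of two or more separate character classes. To create a union, nest one class inside the other, as in [0-4[6-8]]. This union creates a single character class that matches the digits 0, 1, 2, 3, 4, 6, 7, and 8.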

Enter your regex: [0-4[6-8]]

Enter input string to search: 0

I found the text "0" starting at index 0 and ending at index 1.

Enter your regex: [0-4[6-8]]

Enter input string to search: 5

No match found.

Enter your regex: [0-4[6-8]]

Enter input string to search: 6

I found the text "6" starting at index 0 and ending at index 1.

Enter your regex: [0-4[6-8]]

Enter input string to search: 8

I found the text "8" starting at index 0 and ending at index 1.

Enter your regex: [0-4[6-8]]

Enter input string to search: 9

No match found.
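For a single character, a union behaves like an alternation between its member classes. The sketch below compares [0-4[6-8]] with the alternation [0-4]|[6-8] over the ten digits; the UnionDemo class and matchedDigits helper are hypothetical names introduced for this illustration.

```java
import java.util.regex.Pattern;

class UnionDemo {
    // Returns the digits 0-9 that the regex matches as a whole input.
    static String matchedDigits(String regex) {
        StringBuilder sb = new StringBuilder();
        for (char c = '0'; c <= '9'; c++) {
            if (Pattern.matches(regex, String.valueOf(c))) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchedDigits("[0-4[6-8]]"));   // 01234678
        System.out.println(matchedDigits("[0-4]|[6-8]"));  // 01234678
    }
}
```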

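4. Intersections

To create a single character class that matches only the characters common to all of its nested classes, use &&, as in [0-9&&[345]]. This intersection creates a single character class that matches only the digits common to both character classes (3, 4, and 5):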

Enter your regex: [0-9&&[345]]

Enter input string to search: 3

I found the text "3" starting at index 0 and ending at index 1.

Enter your regex: [0-9&&[345]]

Enter input string to search: 4

I found the text "4" starting at index 0 and ending at index 1.

Enter your regex: [0-9&&[345]]

Enter input string to search: 5

I found the text "5" starting at index 0 and ending at index 1.

Enter your regex: [0-9&&[345]]

Enter input string to search: 2

No match found.

Enter your regex: [0-9&&[345]]

Enter input string to search: 6

No match found.

The following example shows the intersection of two ranges:

Enter your regex: [2-8&&[4-6]]

Enter input string to search: 3

No match found.

Enter your regex: [2-8&&[4-6]]

Enter input string to search: 4

I found the text "4" starting at index 0 and ending at index 1.

Enter your regex: [2-8&&[4-6]]

Enter input string to search: 5

I found the text "5" starting at index 0 and ending at index 1.

Enter your regex: [2-8&&[4-6]]

Enter input string to search: 6

I found the text "6" starting at index 0 and ending at index 1.

Enter your regex: [2-8&&[4-6]]

Enter input string to search: 7

No match found.
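The intersection results above can be reproduced in a single pass by scanning the string "0123456789" and keeping whatever matches. This is a sketch under assumed names (IntersectionDemo, matchedIn); the last call uses two ranges with no digits in common, so nothing is matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntersectionDemo {
    // Concatenates every match of regex found in input.
    static String matchedIn(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String digits = "0123456789";
        System.out.println(matchedIn("[0-9&&[345]]", digits));  // 345
        System.out.println(matchedIn("[2-8&&[4-6]]", digits));  // 456
        // Disjoint ranges share no characters, so the result is empty.
        System.out.println(matchedIn("[0-3&&[6-9]]", digits).isEmpty());  // true
    }
}
```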

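5. Subtraction

Finally, you can use subtraction to exclude one or more nested character classes, as in [0-9&&[^345]]. This example creates a single character class that matches every digit from 0 to 9 except 3, 4, and 5: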

Enter your regex: [0-9&&[^345]]

Enter input string to search: 2

I found the text "2" starting at index 0 and ending at index 1.

Enter your regex: [0-9&&[^345]]

Enter input string to search: 3

No match found.

Enter your regex: [0-9&&[^345]]

Enter input string to search: 4

No match found.

Enter your regex: [0-9&&[^345]]

Enter input string to search: 5

No match found.

Enter your regex: [0-9&&[^345]]

Enter input string to search: 6

I found the text "6" starting at index 0 and ending at index 1.

Enter your regex: [0-9&&[^345]]

Enter input string to search: 9

I found the text "9" starting at index 0 and ending at index 1.
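Subtraction is also convenient outside of searching, for example with String.replaceAll. The sketch below is an illustration only: the SubtractionDemo class, the replace helper, and the sample strings are assumptions, not taken from the original text.

```java
class SubtractionDemo {
    // Replaces every digit other than 3, 4, and 5 with the given text.
    static String replace(String input, String replacement) {
        return input.replaceAll("[0-9&&[^345]]", replacement);
    }

    public static void main(String[] args) {
        // Deleting the matched digits leaves only the excluded ones.
        System.out.println(replace("0123456789", ""));  // 345
        // Letters are outside [0-9], so they are never replaced.
        System.out.println(replace("a1b3c5d7", "#"));   // a#b3c5d#
    }
}
```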

Now that we have covered how character classes are created, you may want to review Table 13-1 before continuing with the next section.
