Code Resource: https://github.com/MoreYoungGavin/02100211_Study_RegularExpression
Expression Parts
( starts a group of atoms
) ends a group of atoms
(?= starts a group that’s a positive look-ahead
(?! starts a group that’s a negative look-ahead
(?<= starts a group that’s a positive look-behind
(?<! starts a group that’s a negative look-behind
) ends any of the previous groups
(?<…> the start of a named group, where … is substituted with the name
) the end of the named group
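As a rough illustration of the group syntax above, here is a minimal sketch using Python's built-in re module. The sample strings and variable names are made up for the example. One difference to be aware of: Python spells a named group (?P<name>…) rather than the (?<…> form listed above, and its look-behinds must have a fixed width.

```python
import re

# Plain group: the parentheses group "ab" so that + repeats the pair.
grouped = re.fullmatch(r"(ab)+", "ababab")

# Positive look-ahead: "foo" only when followed by "bar" ("bar" is not consumed).
lookahead = re.findall(r"foo(?=bar)", "foobar foobaz")

# Negative look-ahead: "foo" only when NOT followed by "bar".
neg_lookahead = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# Positive look-behind: digits that are preceded by "$".
lookbehind = re.findall(r"(?<=\$)\d+", "cost $42, qty 7")

# Negative look-behind: digits not preceded by "$" or by another digit.
neg_lookbehind = re.findall(r"(?<![\$\d])\d+", "cost $42, qty 7")

# Named group: Python writes (?P<year>...) for what the list shows as (?<...>.
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "date: 2024-05")

print(lookahead, neg_lookahead, lookbehind, neg_lookbehind)
print(named.group("year"), named.group("month"))
```

Look-ahead and look-behind groups only test the surrounding text; they never become part of the match itself, which is why the results above contain "foo" and the bare digits.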
Quantifiers
? means “zero or one”: it matches the preceding expression zero or one time, making it optional
+ means “one or more”: it matches the preceding expression one or more times, making it required while matching as many occurrences as possible
* means “zero or more”: use this quantifier carefully, because it also matches zero occurrences of the preceding expression, which can produce unexpected results
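The three symbols above can be compared side by side. This is a small sketch in Python's re module with invented sample strings; it also shows the kind of surprise * can cause, since it succeeds even when nothing is there to match.

```python
import re

# "?" makes the "u" optional, so both spellings match.
optional = re.findall(r"colou?r", "color colour colouur")

# "+" needs at least one digit and then takes as many as it can.
one_or_more = re.findall(r"\d+", "a1 b22 c")

# "+" fails when there is no digit at the start of the string...
plus_on_letters = re.match(r"\d+", "abc")

# ...but "*" still succeeds there, with an empty match.
star_on_letters = re.match(r"\d*", "abc")

print(optional)                 # ['color', 'colour']
print(one_or_more)              # ['1', '22']
print(plus_on_letters)          # None
print(repr(star_on_letters.group()))  # ''
```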
Ranges
{ specifies the beginning of a range
} specifies the end of a range
{n} specifies the preceding expression is found exactly n times
{n,} specifies the preceding expression is found at least n times
{n,m} specifies the preceding expression is found at least n but no more than m times
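A short sketch of the three range forms, again in Python's re module with made-up inputs. The \b word boundary used in the last pattern is not part of the list above; it is only there to keep the pattern from matching inside a longer run of digits.

```python
import re

# {n}: exactly three digits.
exactly_three = re.fullmatch(r"\d{3}", "123") is not None
not_three = re.fullmatch(r"\d{3}", "1234") is not None

# {n,}: at least two digits, with no upper limit.
at_least_two = re.fullmatch(r"\d{2,}", "123456") is not None
too_few = re.fullmatch(r"\d{2,}", "1") is not None

# {n,m}: between two and four digits, as whole words.
between = re.findall(r"\b\d{2,4}\b", "1 12 1234 12345")

print(exactly_three, not_three)  # True False
print(at_least_two, too_few)     # True False
print(between)                   # ['12', '1234']
```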
Line Anchors
^ specifies the beginning of the line
$ specifies the end of the line
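How the anchors behave depends on the engine's line mode. As a hedged example in Python's re module: by default ^ and $ anchor to the start and end of the whole string, and the re.MULTILINE flag makes them apply to every line. The sample text is invented.

```python
import re

text = "first line\nsecond line"

# Default mode: ^ matches only at the very start of the string.
start_default = re.findall(r"^\w+", text)

# MULTILINE mode: ^ matches at the start of every line.
start_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Default mode: $ matches only at the end of the string.
end_default = re.findall(r"\w+$", text)

# MULTILINE mode: $ matches at the end of every line.
end_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(start_default, start_multiline)  # ['first'] ['first', 'second']
print(end_default, end_multiline)      # ['line'] ['line', 'line']
```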
An Escape
\ indicates the escape character
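To see why escaping matters, the sketch below (Python's re module, invented sample strings) compares an unescaped dot with an escaped one. It also uses re.escape, a standard-library helper that escapes a whole literal string for you.

```python
import re

# Unescaped "." matches any character, so "3x14" matches as well.
loose = re.findall(r"3.14", "3.14 3x14")

# Escaped "\." matches only a literal dot.
strict = re.findall(r"3\.14", "3.14 3x14")

# re.escape escapes the special characters ("+" and "?") in a literal string.
escaped = re.escape("1+1=2?")
literal_match = re.fullmatch(escaped, "1+1=2?") is not None

print(loose)          # ['3.14', '3x14']
print(strict)         # ['3.14']
print(literal_match)  # True
```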
Saying “Or”
| indicates “or”
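A brief sketch of alternation in Python's re module, with made-up input. Without a group, | splits the entire pattern into alternatives; wrapping it in parentheses limits the choice to just that part.

```python
import re

# Either whole word matches.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# The group limits the choice to the single letter between "gr" and "y".
grey = [m.group(0) for m in re.finditer(r"gr(a|e)y", "gray grey groy")]

print(pets)  # ['cat', 'dog']
print(grey)  # ['gray', 'grey']
```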
Character Classes
[ indicates the beginning of a character class
- indicates a range inside a character class (unless it’s first in the class)
^ indicates a negated character class, if found first
] indicates the end of a character class
\d this matches any digit, such as 0-9
\D this matches any character that isn’t a digit, such as punctuation and letters A-Z and a-z
\p{…} this matches any character that’s in the Unicode group named inside the braces
\P{…} this matches any character that’s not in the Unicode group named inside the braces
\s this matches any whitespace, such as spaces, tabs, or returns
\S this matches any non-whitespace character
\un this matches the Unicode character whose code n is expressed in hexadecimal notation (for example, \u0041 matches A)
\w this matches any word character. Word characters in English are 0-9, A-Z, a-z, and _.
\W this matches any nonword character
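Most of the character classes above can be tried out with a few one-liners. This is a minimal sketch in Python's re module using invented sample text. The \p{…} and \P{…} classes are left out because Python's built-in re module does not support them.

```python
import re

sample = "Room 42_B, tel: 555-0199!"

# [ ... ] lists the allowed characters.
vowels = re.findall(r"[aeiou]", "regex")

# "-" inside a class gives a range: digits plus the letters a to f.
hex_runs = re.findall(r"[0-9a-f]+", "ff 1a zz")

# "^" as the first character negates the class: anything except digits.
non_digit_runs = re.findall(r"[^0-9]+", "a1b22c")

# \d matches digits; \D matches everything else (removed here).
digits = re.findall(r"\d+", sample)
digits_only = re.sub(r"\D", "", sample)

# \w matches letters, digits, and the underscore.
words = re.findall(r"\w+", sample)

# \s matches spaces, tabs, and newlines.
fields = re.split(r"\s+", "a b\tc\nd")

# \u followed by four hexadecimal digits matches that one character.
capital_a = re.fullmatch(r"\u0041", "A") is not None

print(vowels, hex_runs, non_digit_runs)
print(digits, digits_only)
print(words, fields, capital_a)
```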
Matching Anything
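. indicates any character (in most regex flavors, any character except a newline unless a single-line or dot-all option is set)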
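As a small example of the dot in Python's re module (sample strings invented): by default it matches any character except a newline, and the re.DOTALL flag lifts that restriction.

```python
import re

# "." stands in for any single character between "c" and "t",
# but by default it does not match the newline in "c\nt".
any_char = re.findall(r"c.t", "cat cot c\nt c-t")

# With DOTALL, "." matches the newline too.
with_dotall = re.findall(r"c.t", "c\nt", re.DOTALL)

print(any_char)     # ['cat', 'cot', 'c-t']
print(with_dotall)  # ['c\nt']
```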