//split
String s = "hello world i am 23 years";
String[] ss = s.split("\\s");
//replace
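String hashed = s.replaceAll("\\s", "#"); //returns a new String; s itself is unchanged
String firstDigitMasked = s.replaceFirst("\\d", "#"); //replaces only the first digit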
//greedy(default) & reluctant
String greedy = "\\d*";
String lazy = "\\d*?";
//Pattern & Matcher
String phone = "my phone: 13352135478. Hers is 15984563215. call us later.";
Pattern p = Pattern.compile("\\d{11}");
Matcher m = p.matcher(phone);
while (m.find()) {
    System.out.printf("find: %s, start: %d, end: %d\n", m.group(), m.start(), m.end());
}
//match the entire input against the regex with Pattern.matches(regex, input)
boolean wholeMatch = Pattern.matches("\\d{4}-\\d{7}", "0592-6103687");
//groups
String tel = "my tel is 0592-6103625. call me at 12:00";
Pattern p2 = Pattern.compile("(\\d{4})-(\\d{7})");
Matcher m2 = p2.matcher(tel);
while (m2.find()) {
    int count = m2.groupCount();
    //group 0 is the whole match; groups 1..count are the capturing groups
    for (int i = 0; i <= count; i++) {
        System.out.printf("group: %s. start: %d, end: %d\n", m2.group(i), m2.start(i), m2.end(i));
    }
}
//static methods
Pattern.matches("\\d{4}", "1125f"); //false
Pattern.matches("\\d{4}", "1125"); //true
//scanner
Scanner scanner = new Scanner("xushaoxun@gmail.com mail me if you have time");
System.out.println(scanner.next("(\\w+)@(\\w+)\\.(\\w{3})"));
Regular Expression
Character Classes
[abc] | a, b, or c (simple class) |
[^abc] | Any character except a, b, or c (negation) |
[a-zA-Z] | a through z, or A through Z, inclusive (range) |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
[a-z&&[def]] | d, e, or f (intersection) |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) |
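The union, intersection, and subtraction forms are easy to mix up, so here is a minimal sketch that checks one character against each of them. The class name CharClassDemo and the helper matches are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Returns true if the whole string c matches the given character class.
    static boolean matches(String charClass, String c) {
        return Pattern.matches(charClass, c);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));    // union: n is in m-p -> true
        System.out.println(matches("[a-z&&[def]]", "e"));  // intersection: e is in both -> true
        System.out.println(matches("[a-z&&[^m-p]]", "n")); // subtraction: n is excluded -> false
    }
}
```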
Predefined Character Classes
. | Any character (may or may not match line terminators) |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character: [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character: [^\w] |
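As a small illustration of the predefined classes, the sketch below uses \D to strip non-digits and \s+ to split on whitespace. The class and method names (PredefinedClassDemo, keepDigits, words) are made up for this example.

```java
import java.util.Arrays;

class PredefinedClassDemo {
    // Removes every non-digit (\D) so that only the digits remain.
    static String keepDigits(String input) {
        return input.replaceAll("\\D", "");
    }

    // Splits on runs of whitespace (\s+) after trimming both ends.
    static String[] words(String input) {
        return input.trim().split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(keepDigits("0592-6103687"));                      // 05926103687
        System.out.println(Arrays.toString(words("hello  world\ti am 23"))); // [hello, world, i, am, 23]
    }
}
```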
Quantifiers
Greedy | Reluctant | Possessive | Meaning |
X? | X?? | X?+ | X, once or not at all |
X* | X*? | X*+ | X, zero or more times |
X+ | X+? | X++ | X, one or more times |
X{n} | X{n}? | X{n}+ | X, exactly n times |
X{n,} | X{n,}? | X{n,}+ | X, at least n times |
X{n,m} | X{n,m}? | X{n,m}+ | X, at least n but not more than m times |
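The three columns differ in how much input the quantifier takes and whether it gives any back. A rough sketch, using a sample input chosen for this example and a hypothetical helper findAll, runs the same search with each variant:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // Greedy: .* takes everything, then backs off until "foo" fits.
        System.out.println(findAll(".*foo", input));  // [xfooxxxxxxfoo]
        // Reluctant: .*? takes as little as possible before "foo".
        System.out.println(findAll(".*?foo", input)); // [xfoo, xxxxxxfoo]
        // Possessive: .*+ takes everything and never backs off, so "foo" cannot match.
        System.out.println(findAll(".*+foo", input)); // []
    }
}
```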
Capturing Groups
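Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups, numbered 1 to 4 in this order: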
- ((A)(B(C)))
- (A)
- (B(C))
- (C)
There is also a special group, group 0, which always represents the entire expression.
Group methods
public int start(int group)
public int end(int group)
public String group(int group)
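To see the numbering in practice, the following sketch matches ((A)(B(C))) against the input "ABC" and lists group 0 through groupCount(). GroupDemo and groups are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns the text of group 0..groupCount() for the first match, or an empty list.
    static List<String> groups(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Index 0 is the whole match; indexes 1 to 4 follow the opening parentheses.
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
    }
}
```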
Backreferences
The section of the input string matching the capturing group(s) is saved in memory for later recall via backreference. A backreference is specified in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled. To match any 2 digits, followed by the exact same two digits, you would use (\d\d)\1 as the regular expression.
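The (\d\d)\1 expression can be tried out with a short sketch like the one below; the class name BackrefDemo and the method isRepeatedPair are invented for this example. Note that in a Java string literal each backslash has to be doubled.

```java
import java.util.regex.Pattern;

class BackrefDemo {
    // True if input is two digits followed by the same two digits again.
    static boolean isRepeatedPair(String input) {
        return Pattern.matches("(\\d\\d)\\1", input);
    }

    public static void main(String[] args) {
        System.out.println(isRepeatedPair("1212")); // true: "12" followed by "12"
        System.out.println(isRepeatedPair("1234")); // false: "34" is not a repeat of "12"
    }
}
```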
Boundary Matchers
^ | The beginning of a line |
$ | The end of a line |
\b | A word boundary |
\B | A non-word boundary |
\A | The beginning of the input |
\G | The end of the previous match |
\Z | The end of the input but for the final terminator, if any |
\z | The end of the input |
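One detail worth checking by hand: by default ^ and $ only match at the start and end of the whole input, and they match at every line only when Pattern.MULTILINE is set, whereas \A always means the start of the input. The sketch below counts matches under those settings; BoundaryDemo and count are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Counts how many times regex (compiled with the given flags) is found in input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String lines = "one two\nthree four";
        // \b: "dog" is found as a whole word once; "doggie" does not count.
        System.out.println(count("\\bdog\\b", 0, "The dog chased a doggie")); // 1
        // ^ without MULTILINE matches only at the start of the input.
        System.out.println(count("^\\w+", 0, lines));                         // 1
        // ^ with MULTILINE matches at the start of each line.
        System.out.println(count("^\\w+", Pattern.MULTILINE, lines));         // 2
        // \A ignores MULTILINE and matches only at the start of the input.
        System.out.println(count("\\A\\w+", Pattern.MULTILINE, lines));       // 1
    }
}
```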
Pattern class
static methods
Pattern.matches(String regex, CharSequence input);
Pattern.compile(String regex, int flags);
instance methods
Matcher matcher = pattern.matcher(CharSequence input);
pattern.split(CharSequence input);
java.lang.String equivalence
str.matches(regex);
String[] parts = str.split(regex);
str.replaceAll(regex, replacement); //note: str.replace(target, replacement) treats target as literal text, not as a regex
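A brief sketch of the Pattern methods listed above: compile with a flag, split, and a case-insensitive search. The names PatternDemo, splitOnCommas, and containsIgnoreCase are illustrative; Pattern.quote is used here so that the search word is treated as literal text.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternDemo {
    // Splits on commas, swallowing any whitespace around each comma.
    static String[] splitOnCommas(String input) {
        return Pattern.compile("\\s*,\\s*").split(input);
    }

    // Searches for word as literal text, ignoring case.
    static boolean containsIgnoreCase(String word, String input) {
        Pattern p = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnCommas("a , b,c")));  // [a, b, c]
        System.out.println(containsIgnoreCase("java", "I like JAVA"));  // true
    }
}
```

Compiling the Pattern once and reusing it is usually preferable to calling Pattern.matches or String.split repeatedly with the same regex, since those recompile the expression on each call.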
Matcher class
Index Methods
Index methods provide useful index values that show precisely where the match was found in the input string:
public int start()
public int start(int group)
public int end()
public int end(int group)
Study Methods
Study methods review the input string and return a boolean indicating whether or not the pattern is found.
public boolean lookingAt()
public boolean find()
public boolean find(int start)
public boolean matches()
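Of these, find(int start) is the least obvious: it resets the matcher and begins searching at the given index. A minimal sketch, with the made-up names StudyDemo and findFrom:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StudyDemo {
    // Returns the start index of the first match at or after start, or -1 if none.
    static int findFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "foo bar foo";
        System.out.println(findFrom("foo", input, 0)); // 0: first "foo"
        System.out.println(findFrom("foo", input, 1)); // 8: second "foo"
        System.out.println(findFrom("foo", input, 9)); // -1: no "foo" from index 9 on
    }
}
```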
Replacement Methods
Replacement methods are useful methods for replacing text in an input string.
public Matcher appendReplacement(StringBuffer sb, String replacement)
public StringBuffer appendTail(StringBuffer sb)
public String replaceAll(String replacement)
public String replaceFirst(String replacement)
public static String quoteReplacement(String s)
Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Backslashes ('\') and dollar signs ('$') will be given no special meaning.
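This matters whenever the replacement text comes from data rather than from the programmer. In the sketch below, the replacement "$5" would otherwise be read as a reference to group 5; quoting it makes it plain text. The class name, the method replacePrice, and the PRICE placeholder are assumptions made for the example.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces the placeholder PRICE with literalReplacement, taken verbatim.
    static String replacePrice(String input, String literalReplacement) {
        return input.replaceAll("PRICE", Matcher.quoteReplacement(literalReplacement));
    }

    public static void main(String[] args) {
        // Without quoteReplacement, "$5" would be treated as a group reference.
        System.out.println(replacePrice("cost: PRICE", "$5")); // cost: $5
    }
}
```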
java.lang.String equivalence
str.replaceFirst(regex, replacement);
str.replaceAll(regex, replacement);
Some examples:
- Use start() and end()
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(input);
while (mat.find()) {
    System.out.println(mat.start());
    System.out.println(mat.end());
}
- Use matches() and lookingAt()
String input = "fooooooooooooo";
String regex = "foo";
Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(input);
System.out.println(mat.lookingAt()); //true: the input starts with "foo"
System.out.println(mat.matches()); //false: the whole input is not exactly "foo"
- Use appendReplacement() and appendTail()
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("a cat");
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb); //prints "a dog", just like "a cat".replaceAll("cat", "dog")
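The appendReplacement()/appendTail() loop becomes more useful than replaceAll() when each replacement has to be computed from the match. As a rough sketch (AppendReplacementDemo and doubleNumbers are hypothetical names), this doubles every number in a string:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {
    // Replaces each run of digits with twice its numeric value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // Copies the text before the match, then the computed replacement.
            m.appendReplacement(sb, Integer.toString(doubled));
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 cats, 10 dogs")); // 6 cats, 20 dogs
    }
}
```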