Simple uses of regexp

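        //imports needed: java.util.Scanner, java.util.regex.Matcher, java.util.regex.Pattern

        //split
        String s = "hello world i am 23 years";
        String[] ss = s.split("\\s");

        //replace (Strings are immutable, so keep the returned value)
        String allReplaced = s.replaceAll("\\s", "#");
        String firstReplaced = s.replaceFirst("\\d", "#");

        //greedy(default) & reluctant
        String greedy = "\\d*";
        String lazy = "\\d*?";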

        //Pattern & Matcher
        String phone = "my phone: 13352135478. Hers is 15984563215. call us later.";
        Pattern p = Pattern.compile("\\d{11}");
        Matcher m = p.matcher(phone);
        while(m.find()) {
            System.out.printf("find: %s, start: %d, end: %d\n", m.group(), m.start(), m.end());
        }
       
        //match the entire input against the regexp with Pattern.matches(regex, input)
        boolean wholeMatch = Pattern.matches("\\d{4}-\\d{7}", "0592-6103687");
       
        //groups
        String tel = "my tel is 0592-6103625. call me at 12:00";
        Pattern p2 = Pattern.compile("(\\d{4})-(\\d{7})");
        Matcher m2 = p2.matcher(tel);
        while(m2.find()) {
            int count = m2.groupCount();
            for(int i=0; i<=count; i++) {
                System.out.printf("group: %s. start: %d, end: %d\n", m2.group(i), m2.start(i), m2.end(i));
            }
        }
       
        //static method
        Pattern.matches("\\d{4}", "1125f");     //false
        Pattern.matches("\\d{4}", "1125");      //true

 

        //scanner: next(pattern) returns the next token only if the whole token matches the pattern
        Scanner scanner = new Scanner("xushaoxun@gmail.com mail me if you have time");
        System.out.println(scanner.next("(\\w+)@(\\w+)\\.(\\w{3})"));

 

Regular Expression

Character Classes

  [abc]            a, b, or c (simple class)
  [^abc]           Any character except a, b, or c (negation)
  [a-zA-Z]         a through z, or A through Z, inclusive (range)
  [a-d[m-p]]       a through d, or m through p: [a-dm-p] (union)
  [a-z&&[def]]     d, e, or f (intersection)
  [a-z&&[^m-p]]    a through z, and not m through p: [a-lq-z] (subtraction)
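The union, intersection, and subtraction forms are easier to trust once you see which characters survive each one. The sketch below is only an illustration: the class name CharClassDemo and the helper keep are made up for this example, and the helper simply tests every character of the input against the class.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: keeps only the characters of input that the
    // single-character class accepts.
    static String keep(String charClass, String input) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "abcdefmnopxyz";
        System.out.println(keep("[a-d[m-p]]", input));     // union: abcdmnop
        System.out.println(keep("[a-z&&[def]]", input));   // intersection: def
        System.out.println(keep("[a-z&&[^m-p]]", input));  // subtraction: abcdefxyz
    }
}
```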

 

Predefined Character Classes

  .      Any character (may or may not match line terminators)
  \d     A digit: [0-9]
  \D     A non-digit: [^0-9]
  \s     A whitespace character: [ \t\n\x0B\f\r]
  \S     A non-whitespace character: [^\s]
  \w     A word character: [a-zA-Z_0-9]
  \W     A non-word character: [^\w]
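A quick way to check the upper-case complements is to delete everything they match and look at what remains. This is a minimal sketch; the class name PredefinedClassDemo and its two helpers are invented for the example.

```java
class PredefinedClassDemo {
    // \D matches every non-digit, so removing \D leaves only the digits.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // \W matches every non-word character, so removing \W leaves [a-zA-Z_0-9].
    static String wordCharsOnly(String input) {
        return input.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Room 42, Floor 7"));       // 427
        System.out.println(wordCharsOnly("hello_world-2024!"));   // hello_world2024
        // \s covers space, tab and newline alike
        System.out.println("a b\tc\nd".split("\\s").length);      // 4
    }
}
```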

 

Quantifiers

  Greedy     Reluctant    Possessive    Meaning
  X?         X??          X?+           X, once or not at all
  X*         X*?          X*+           X, zero or more times
  X+         X+?          X++           X, one or more times
  X{n}       X{n}?        X{n}+         X, exactly n times
  X{n,}      X{n,}?       X{n,}+        X, at least n times
  X{n,m}     X{n,m}?      X{n,m}+       X, at least n but not more than m times
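The three columns differ only in how much input the quantifier takes and whether it gives any back. A greedy quantifier takes as much as it can and backtracks if the rest of the pattern fails, a reluctant one takes as little as it can, and a possessive one takes as much as it can and never backtracks. The sketch below shows this on one input; the class name QuantifierDemo and the helper firstMatch are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first substring matched by regex,
    // or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        // greedy: .* grabs everything, then backs off just enough for "foo"
        System.out.println(firstMatch(".*foo", input));    // xfooxxxxxxfoo
        // reluctant: .*? grows one character at a time until "foo" fits
        System.out.println(firstMatch(".*?foo", input));   // xfoo
        // possessive: .*+ grabs everything and never gives it back
        System.out.println(firstMatch(".*+foo", input));   // null
    }
}
```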

 

Capturing Groups

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:

  1. ((A)(B(C)))
  2. (A)
  3. (B(C))
  4. (C)

There is also a special group, group 0, which always represents the entire expression.

Group methods (in Matcher)

public int start(int group)

public int end(int group)

public String group(int group)
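To see the numbering in practice, the sketch below matches the input "ABC" against ((A)(B(C))) and lists every group from 0 up to groupCount(). The class name GroupNumberingDemo and the helper groups are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns group 0 followed by every capturing group,
    // or an empty list when the whole input does not match.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.matches()) {
            // groupCount() does not count group 0, hence the <=
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(groups("((A)(B(C)))", "ABC"));   // [ABC, ABC, A, BC, C]
    }
}
```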

Backreferences

The section of the input string matching the capturing group(s) is saved in memory for later recall via a backreference. A backreference is specified in the regular expression as a backslash (\) followed by a digit indicating the number of the group to be recalled. For example, to match any two digits followed by the exact same two digits, use (\d\d)\1 as the regular expression.
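The (\d\d)\1 expression can be tried directly. In this small sketch the class name BackreferenceDemo and the helper isRepeatedPair are invented for illustration; note that each backslash is doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

class BackreferenceDemo {
    // True when the input is two digits followed by the same two digits.
    static boolean isRepeatedPair(String input) {
        return Pattern.matches("(\\d\\d)\\1", input);
    }

    public static void main(String[] args) {
        System.out.println(isRepeatedPair("1212"));   // true
        System.out.println(isRepeatedPair("1234"));   // false
    }
}
```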

 

Boundary Matchers

  ^      The beginning of a line
  $      The end of a line
  \b     A word boundary
  \B     A non-word boundary
  \A     The beginning of the input
  \G     The end of the previous match
  \Z     The end of the input but for the final terminator, if any
  \z     The end of the input
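A few of these are easy to mix up. By default ^ and $ only match at the beginning and end of the whole input; they work line by line only when the pattern is compiled with Pattern.MULTILINE. \Z tolerates one trailing line terminator while \z does not. The sketch below checks these points; the class name BoundaryDemo and the helper count are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Hypothetical helper: counts how many times the pattern is found.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // \b: "cat" inside "concat" is not on a word boundary
        System.out.println(count(Pattern.compile("\\bcat\\b"), "cat concat cat."));            // 2
        // ^ without and with MULTILINE
        System.out.println(count(Pattern.compile("^\\w+"), "one\ntwo"));                       // 1
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), "one\ntwo"));    // 2
        // \Z ignores a final line terminator, \z does not
        System.out.println(Pattern.compile("abc\\Z").matcher("abc\n").find());                 // true
        System.out.println(Pattern.compile("abc\\z").matcher("abc\n").find());                 // false
    }
}
```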

 

Pattern class

Static methods

Pattern.matches(String regex, CharSequence input);

Pattern.compile(String regex, int flags);

 

Instance methods

Matcher matcher = pattern.matcher(CharSequence input);

pattern.split(CharSequence input);
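As a rough illustration of the flags argument and of pattern.split, the sketch below compiles one pattern with Pattern.CASE_INSENSITIVE and uses another as a delimiter. The class name PatternApiDemo and both helper names are assumptions for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternApiDemo {
    // Whole-input match that ignores letter case.
    static boolean matchesIgnoringCase(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // Splits on commas, swallowing any whitespace around them.
    static String[] splitOnCommas(String input) {
        return Pattern.compile("\\s*,\\s*").split(input);
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("hello", "HeLLo"));         // true
        System.out.println(Arrays.toString(splitOnCommas("a , b,c")));     // [a, b, c]
    }
}
```

Compiling the pattern once and reusing it is usually preferable to calling Pattern.matches repeatedly, since the static method recompiles the regex on every call.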

 

java.lang.String equivalence

str.matches(regex);

String[] parts = str.split(regex);

str.replace(target, replacement);     //target is literal text, not a regex
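One thing to watch among the String methods: String.replace treats its first argument as literal text, while replaceAll, replaceFirst, split and matches interpret it as a regex. The sketch below shows the difference with a dot, which is a metacharacter; the class name StringRegexDemo and its helpers are hypothetical.

```java
class StringRegexDemo {
    // Literal replacement: only real dots are replaced.
    static String replaceLiteralDot(String input) {
        return input.replace(".", "-");
    }

    // Regex replacement: an unescaped dot matches every character.
    static String replaceRegexDot(String input) {
        return input.replaceAll(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteralDot("a.b.c"));          // a-b-c
        System.out.println(replaceRegexDot("a.b.c"));            // -----
        System.out.println("a.b.c".replaceAll("\\.", "-"));      // a-b-c (escaped dot)
        System.out.println("a.b.c".split("\\.").length);         // 3
    }
}
```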

 

Matcher class

Index Methods

Index methods provide useful index values that show precisely where the match was found in the input string:

  • public int start() : Returns the start index of the previous match.
  • public int start(int group) : Returns the start index of the subsequence captured by the given group during the previous match operation.
  • public int end() : Returns the offset after the last character matched.
  • public int end(int group) : Returns the offset after the last character of the subsequence captured by the given group during the previous match operation.

Study Methods

Study methods review the input string and return a boolean indicating whether or not the pattern is found.

  • public boolean lookingAt() : Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
  • public boolean find() : Attempts to find the next subsequence of the input sequence that matches the pattern.
  • public boolean find(int start) : Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
  • public boolean matches() : Attempts to match the entire region against the pattern.
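The two find variants can be sketched as follows: find() walks through successive matches, while find(int start) resets the matcher and searches from the given index. The class name FindDemo and the helpers findAll and findFrom are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Collects every match by calling find() until it returns false.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    // Returns the first match at or after index start, or null if there is none.
    static String findFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "a1 b22 c333";
        System.out.println(findAll("\\d+", input));       // [1, 22, 333]
        System.out.println(findFrom("\\d+", input, 2));   // 22
    }
}
```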

Replacement Methods

Replacement methods are useful methods for replacing text in an input string.

  • public Matcher appendReplacement(StringBuffer sb, String replacement) : Implements a non-terminal append-and-replace step.
  • public StringBuffer appendTail(StringBuffer sb) : Implements a terminal append-and-replace step.
  • public String replaceAll(String replacement) : Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
  • public String replaceFirst(String replacement) : Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
  • public static String quoteReplacement(String s) : Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. The String produced will match the sequence of characters in s treated as a literal sequence. Backslashes ('\') and dollar signs ('$') will be given no special meaning.

java.lang.String equivalence

str.replaceFirst(regex, replacement);

str.replaceAll(regex, replacement);
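In a replacement string, $1, $2 and so on refer to captured groups, which is why quoteReplacement exists for text that contains a literal dollar sign or backslash. The sketch below shows both cases; the class name ReplacementDemo and the helpers swapParts and insertLiteral are assumptions for this example.

```java
import java.util.regex.Matcher;

class ReplacementDemo {
    // Uses group references in the replacement to swap the two captured parts.
    static String swapParts(String tel) {
        return tel.replaceAll("(\\d{4})-(\\d{7})", "$2/$1");
    }

    // Inserts value verbatim in place of every "X". Without quoteReplacement,
    // a value such as "$5.00" would be read as a reference to group 5.
    static String insertLiteral(String template, String value) {
        return template.replaceAll("X", Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        System.out.println(swapParts("0592-6103625"));            // 6103625/0592
        System.out.println(insertLiteral("price: X", "$5.00"));   // price: $5.00
    }
}
```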

 

Some examples:

  • Use start() and end()

Pattern pat = Pattern.compile(regex);
Matcher mat = pat.matcher(input);
while (mat.find()) {
     System.out.println(mat.start());
     System.out.println(mat.end());
}

  • Use matches() and lookingAt()

            String input = "fooooooooooooo";
            String regex = "foo";
            Pattern pat = Pattern.compile(regex);
            Matcher mat = pat.matcher(input);

            System.out.println(mat.lookingAt());      //true: the input starts with "foo"
            System.out.println(mat.matches());        //false: the whole input is not exactly "foo"

  • Use appendReplacement() and appendTail()

Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("a cat");
StringBuffer sb = new StringBuffer();
while (m.find()) {
     m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb);       //prints "a dog", just like "a cat".replaceAll("cat", "dog")

 
