Jakarta Regexp (Java Struts regular expressions)

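This article describes the elements of regular expression syntax, including character classes, predefined classes, and boundary matchers, and explains the difference between greedy and reluctant closures. It also lists the logical operators and backreferences.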


   Character Classes

      [abc]                  Simple character class
      [a-zA-Z]               Character class with ranges
      [^abc]                 Negated character class
NOTE: Incomplete ranges will be interpreted as "starts from zero" or "ends with last character".
I.e. [-a] is the same as [\u0000-a], and [a-] is the same as [a-\uFFFF], [-] means "all characters".
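
The three character-class forms above can be tried out with a short sketch. It assumes java.util.regex from the JDK rather than Jakarta Regexp's own org.apache.regexp.RE class, so it only approximates the behavior described here; the class name CharClassDemo and the helper allIn are hypothetical. The incomplete-range rule from the NOTE is specific to Jakarta Regexp and is deliberately not exercised, since java.util.regex treats a leading or trailing '-' as a literal hyphen.

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // True when every character of the input belongs to the given class.
    static boolean allIn(String charClass, String input) {
        return Pattern.matches(charClass + "+", input);
    }

    public static void main(String[] args) {
        System.out.println(allIn("[abc]", "cab"));        // simple class
        System.out.println(allIn("[a-zA-Z]", "Regexp"));  // class with ranges
        System.out.println(allIn("[^abc]", "xyz"));       // negated class
        System.out.println(allIn("[abc]", "cad"));        // 'd' is outside the class
    }
}
```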

  Standard POSIX Character Classes

      [:alnum:]              Alphanumeric characters.
      [:alpha:]              Alphabetic characters.
      [:blank:]              Space and tab characters.
      [:cntrl:]              Control characters.
      [:digit:]              Numeric characters.
      [:graph:]              Characters that are printable and are also visible.
                           (A space is printable, but not visible, while an
                           'a' is both.)
      [:lower:]              Lower-case alphabetic characters.
      [:print:]              Printable characters (characters that are not
                           control characters.)
      [:punct:]              Punctuation characters (characters that are not letters,
                           digits, control characters, or space characters).
      [:space:]              Space characters (such as space, tab, and formfeed,
                           to name a few).
      [:upper:]              Upper-case alphabetic characters.
      [:xdigit:]             Characters that are hexadecimal digits.

  Non-standard POSIX-style Character Classes

      [:javastart:]          Start of a Java identifier
      [:javapart:]           Part of a Java identifier

  Predefined Classes

      .           Matches any character other than newline
      \w          Matches a "word" character (alphanumeric plus "_")
      \W          Matches a non-word character
      \s          Matches a whitespace character
      \S          Matches a non-whitespace character
      \d          Matches a digit character
      \D          Matches a non-digit character
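
As a rough illustration of the predefined classes, the sketch below collects every match of a pattern in a string. It assumes java.util.regex from the JDK as a stand-in for Jakarta Regexp; PredefinedClassDemo and findAll are hypothetical names. Note that each backslash is doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PredefinedClassDemo {
    // Returns every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "port 8080, retry 3")); // [8080, 3]
        System.out.println(findAll("\\w+", "user_id = 42"));       // [user_id, 42]
        System.out.println(findAll("\\S+", "a  b\tc"));            // [a, b, c]
    }
}
```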

  Boundary Matchers

      ^           Matches only at the beginning of a line
      $           Matches only at the end of a line
      \b          Matches only at a word boundary
      \B          Matches only at a non-word boundary
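
The effect of the boundary matchers is easiest to see by counting matches. This is a minimal sketch assuming java.util.regex rather than Jakarta Regexp; BoundaryDemo and count are hypothetical names. One difference worth flagging: in java.util.regex, ^ and $ match at line boundaries only when Pattern.MULTILINE is set, which is why the last call passes that flag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryDemo {
    // Counts non-overlapping matches of regex in input under the given flags.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "cat concat cats";
        System.out.println(count("cat", 0, text));        // 3: every occurrence
        System.out.println(count("\\bcat\\b", 0, text));  // 1: only the standalone word
        System.out.println(count("\\Bcat", 0, text));     // 1: only the one inside "concat"
        // Lines that consist of a single word: "one" and "three".
        System.out.println(count("^\\w+$", Pattern.MULTILINE, "one\ntwo words\nthree")); // 2
    }
}
```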

  Greedy Closures

      A*          Matches A 0 or more times (greedy)
      A+          Matches A 1 or more times (greedy)
      A?          Matches A 1 or 0 times (greedy)
      A{n}        Matches A exactly n times (greedy)
      A{n,}       Matches A at least n times (greedy)
      A{n,m}      Matches A at least n but not more than m times (greedy)

  Reluctant Closures

      A*?         Matches A 0 or more times (reluctant)
      A+?         Matches A 1 or more times (reluctant)
      A??         Matches A 0 or 1 times (reluctant)

  Logical Operators

      AB          Matches A followed by B
      A|B         Matches either A or B
      (A)         Used for subexpression grouping
      (?:A)       Used for subexpression clustering (just like grouping but
                  no backreferences)
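
The difference between grouping and clustering shows up in how many subexpressions are captured. The sketch below assumes java.util.regex, which uses the same (A) and (?:A) syntax, instead of Jakarta Regexp itself; GroupingDemo and captureGroups are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupingDemo {
    // Number of capturing groups the pattern defines.
    static int captureGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(captureGroups("(a)(b)"));    // 2: both parentheses capture
        System.out.println(captureGroups("(?:a)(b)"));  // 1: the cluster does not capture

        // A cluster still groups alternatives, it just leaves no backreference.
        System.out.println(Pattern.matches("gr(?:a|e)y", "grey")); // true

        Matcher m = Pattern.compile("(\\d+)-(?:\\d+)").matcher("10-20");
        if (m.matches()) {
            System.out.println(m.group(1)); // 10
        }
    }
}
```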

  Backreferences

      \1      Backreference to 1st parenthesized subexpression
      \2      Backreference to 2nd parenthesized subexpression
      \3      Backreference to 3rd parenthesized subexpression
      \4      Backreference to 4th parenthesized subexpression
      \5      Backreference to 5th parenthesized subexpression
      \6      Backreference to 6th parenthesized subexpression
      \7      Backreference to 7th parenthesized subexpression
      \8      Backreference to 8th parenthesized subexpression
      \9      Backreference to 9th parenthesized subexpression
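
A common use of a backreference is spotting an accidentally doubled word. The following is a sketch under the assumption that java.util.regex behaves like Jakarta Regexp for \1; BackrefDemo and firstRepeatedWord are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackrefDemo {
    // (\w+) captures a word; \1 then requires the same text to appear again.
    private static final Pattern REPEATED = Pattern.compile("\\b(\\w+) \\1\\b");

    // Returns the first word that is immediately repeated, or null if none is.
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test"));  // is
        System.out.println(firstRepeatedWord("no repeats here"));    // null
    }
}
```

The leading \b keeps the pattern from matching the "is" at the end of "this" followed by the next "is".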
 

All closure operators (+, *, ?, {m,n}) are greedy by default, meaning that they match as many elements of the string as possible without causing the overall match to fail. If you want a closure to be reluctant (non-greedy), you can simply follow it with a '?'. A reluctant closure will match as few elements of the string as possible when finding matches. {m,n} closures don't currently support reluctancy.
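
The greedy versus reluctant distinction can be sketched with a tag-like pattern. The example assumes java.util.regex as a stand-in for Jakarta Regexp (the +? syntax is the same in both); GreedyDemo and firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyDemo {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        // Greedy: .+ takes as much as it can, so the match runs to the last '>'.
        System.out.println(firstMatch("<.+>", input));   // <a><b>
        // Reluctant: .+? takes as little as it can, so the match stops at the first '>'.
        System.out.println(firstMatch("<.+?>", input));  // <a>
    }
}
```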

Line terminators
A line terminator is a one- or two-character sequence that marks the end of a line of the input character sequence. The following are recognized as line terminators:

  • A newline (line feed) character ('\n'),
  • A carriage-return character followed immediately by a newline character ("\r\n"),
  • A standalone carriage-return character ('\r'),
  • A next-line character ('\u0085'),
  • A line-separator character ('\u2028'), or
  • A paragraph-separator character ('\u2029').
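
To make the list of terminators concrete, the sketch below splits a string on exactly those sequences. It assumes java.util.regex via String.split rather than Jakarta Regexp; LineTerminatorDemo, TERMINATOR, and lines are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

final class LineTerminatorDemo {
    // "\r\n" is listed first so that a CRLF pair counts as one terminator, not two.
    static final String TERMINATOR = "\\r\\n|[\\n\\r\\u0085\\u2028\\u2029]";

    // Splits the input into lines at any of the recognized terminators.
    static List<String> lines(String input) {
        return Arrays.asList(input.split(TERMINATOR));
    }

    public static void main(String[] args) {
        String input = "a\r\nb\rc" + (char) 0x85 + "d"
                + (char) 0x2028 + "e" + (char) 0x2029 + "f\ng";
        System.out.println(lines(input)); // [a, b, c, d, e, f, g]
    }
}
```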

For more information, see:

http://jakarta.apache.org/regexp/apidocs/
