By default, a Scanner splits input tokens along whitespace, but we can also specify our own delimiter pattern in the form of a regular expression.
example 1:
// strings/ScannerDelimiter.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.
import java.util.*;
public class ScannerDelimiter {
  public static void main(String[] args) {
    Scanner scanner = new Scanner("12, 42, 78, 99, 42");
    scanner.useDelimiter("\\s*,\\s*");
    while (scanner.hasNextInt()) {
      System.out.println(scanner.nextInt());
    }
  }
}
/* Output:
12
42
78
99
42
*/
example 2:
This example reads several items in from a string:
String input = "1 fish 2 fish red fish blue fish";
Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
System.out.println(s.nextInt());
System.out.println(s.nextInt());
System.out.println(s.next());
System.out.println(s.next());
s.close();
prints the following output:
1
2
red
blue
The same output can be generated with this code, which uses a regular expression to parse all four tokens at once:
String input = "1 fish 2 fish red fish blue fish";
Scanner s = new Scanner(input);
s.findInLine("(\\d+) fish (\\d+) fish (\\w+) fish (\\w+)");
MatchResult result = s.match();
for (int i = 1; i <= result.groupCount(); i++) {
  System.out.println(result.group(i));
}
s.close();
The default whitespace delimiter used by a scanner is as recognized by Character.isWhitespace. The reset() method will reset the value of the scanner's delimiter to the default whitespace delimiter regardless of whether it was previously changed.
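The effect of reset() on the delimiter can be checked with a small sketch. The class name ScannerResetSketch and its helper methods are made up for illustration; only useDelimiter(), reset(), hasNext() and next() come from the Scanner API. Note that reset() also discards any locale or radix set through useLocale() or useRadix().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerResetSketch {
  // Collects every remaining token from the scanner into a list.
  static List<String> drain(Scanner s) {
    List<String> tokens = new ArrayList<>();
    while (s.hasNext()) {
      tokens.add(s.next());
    }
    return tokens;
  }

  // Tokenizes the input with a comma as the delimiter.
  static List<String> withCommaDelimiter(String input) {
    try (Scanner s = new Scanner(input).useDelimiter(",")) {
      return drain(s);
    }
  }

  // Sets a comma delimiter, then calls reset() before reading anything,
  // so the tokens are split on whitespace again.
  static List<String> afterReset(String input) {
    try (Scanner s = new Scanner(input).useDelimiter(",")) {
      s.reset();
      return drain(s);
    }
  }

  public static void main(String[] args) {
    String input = "a,b c";
    // Comma delimiter: the two tokens are "a" and "b c".
    System.out.println(withCommaDelimiter(input));
    // After reset(): the two tokens are "a,b" and "c".
    System.out.println(afterReset(input));
  }
}
```

Calling reset() before any token is read keeps the sketch simple: if it is called mid-stream, the scanner is still positioned just before the old delimiter text, which then becomes part of the next whitespace-delimited token.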
example 3:
// strings/ThreatAnalyzer.java
// (c)2017 MindView LLC: see Copyright.txt
// We make no guarantees that this code is fit for any purpose.
// Visit http://OnJava8.com for more book information.
import java.util.*;
import java.util.regex.*;
public class ThreatAnalyzer {
  static String threatData =
    "58.27.82.161@08/10/2015\n"
    + "204.45.234.40@08/11/2015\n"
    + "58.27.82.161@08/11/2015\n"
    + "58.27.82.161@08/12/2015\n"
    + "58.27.82.161@08/12/2015\n"
    + "[Next log section with different data format]";
  public static void main(String[] args) {
    Scanner scanner = new Scanner(threatData);
    String pattern = "(\\d+[.]\\d+[.]\\d+[.]\\d+)@"
      + "(\\d{2}/\\d{2}/\\d{4})";
    while (scanner.hasNext(pattern)) {
      scanner.next(pattern);
      MatchResult match = scanner.match();
      String ip = match.group(1);
      String date = match.group(2);
      System.out.format("Threat on %s from %s%n", date, ip);
    }
  }
}
/* Output:
Threat on 08/10/2015 from 58.27.82.161
Threat on 08/11/2015 from 204.45.234.40
Threat on 08/11/2015 from 58.27.82.161
Threat on 08/12/2015 from 58.27.82.161
Threat on 08/12/2015 from 58.27.82.161
*/
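The loop in example 3 stops at the last line of threatData because hasNext(pattern) tests only the next delimiter-separated token, and the token "[Next" does not match. For the same reason, a pattern that itself contains the delimiter can never match. The sketch below illustrates this with the default whitespace delimiter; the class name TokenPatternSketch and the sample strings are made up for illustration.

```java
import java.util.Scanner;

public class TokenPatternSketch {
  // Returns whether the next whitespace-delimited token of the input
  // matches the given regular expression in full.
  static boolean nextTokenMatches(String input, String regex) {
    try (Scanner s = new Scanner(input)) {
      return s.hasNext(regex);
    }
  }

  public static void main(String[] args) {
    // The next token is just "3", so a pattern spanning the space fails.
    System.out.println(nextTokenMatches("3 apples", "\\d+ apples")); // false
    // A pattern describing a single token succeeds.
    System.out.println(nextTokenMatches("3 apples", "\\d+"));        // true
  }
}
```

When a pattern has to span several tokens, findInLine() as used in example 2 is one alternative, since it ignores delimiters.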
Regular expression character classes
| POSIX | Non-standard | Perl/Tcl | Vim | Java | ASCII | Description |
|---|---|---|---|---|---|---|
| | [:ascii:] | | | \p{ASCII} | [\x00-\x7F] | ASCII characters |
| [:alnum:] | | | | \p{Alnum} | [A-Za-z0-9] | Alphanumeric characters |
| | [:word:] | \w | \w | \w | [A-Za-z0-9_] | Alphanumeric characters plus "_" |
| | | \W | \W | \W | [^A-Za-z0-9_] | Non-word characters |
| [:alpha:] | | | \a | \p{Alpha} | [A-Za-z] | Alphabetic characters |
| [:blank:] | | | \s | \p{Blank} | [ \t] | Space and tab |
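The Java column of the table can be tried out directly with java.util.regex. The following is a minimal sketch, assuming the default matching mode (without the UNICODE_CHARACTER_CLASS flag); the class name CharClassSketch and the helper allMatch are made up for illustration.

```java
import java.util.regex.Pattern;

public class CharClassSketch {
  // True if the whole input consists of one or more characters
  // that belong to the given character class.
  static boolean allMatch(String charClass, String input) {
    return Pattern.matches(charClass + "+", input);
  }

  public static void main(String[] args) {
    System.out.println(allMatch("\\p{Alnum}", "abc123"));  // true
    System.out.println(allMatch("\\p{Alnum}", "abc_123")); // false: "_" is not alphanumeric
    System.out.println(allMatch("\\w", "abc_123"));        // true: word characters include "_"
    System.out.println(allMatch("\\W", "-+!"));            // true: none are word characters
    System.out.println(allMatch("\\p{Alpha}", "abc1"));    // false: "1" is not alphabetic
    System.out.println(allMatch("\\p{Blank}", " \t"));     // true: space and tab
    System.out.println(allMatch("\\p{Blank}", "\n"));      // false: newline is not blank
    System.out.println(allMatch("\\p{ASCII}", "\u00e9"));  // false: outside the ASCII range
  }
}
```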
references:
1. On Java 8 - Bruce Eckel
2. https://github.com/wangbingfeng/OnJava8-Examples/blob/master/strings/ScannerDelimiter.java
3. https://docs.oracle.com/javase/8/docs/api/java/util/Scanner.html
4. https://github.com/wangbingfeng/OnJava8-Examples/blob/master/strings/ThreatAnalyzer.java