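I'm just getting into regular expressions in Java, working from a book and the Java documentation. Given the program below, I can't figure out why "[\\s*]" is not equivalent to "\\s*" when used as a delimiter. It seems that "[\\s*]" is equivalent to "\\s+". Can someone explain logically why this is so?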

import java.util.Scanner;
import java.util.regex.Pattern;
public class ScanString {
    public static void main(String[] args) {
        String str = "Smith , where Jones had had 'had', had had 'had had'.";
        String regex = "had";
        System.out.println("String is:\n" + str + "\nToken sought is " + regex);

        Pattern had = Pattern.compile(regex);
        Scanner strScan = new Scanner(str);
        strScan.useDelimiter("\\s*");
        int hadCount = 0;
        while(strScan.hasNext()) {
            if(strScan.hasNext(had)) {
                ++hadCount;
                System.out.println("Token found!: " + strScan.next(had));

            } else {
                System.out.println("Token is    : " + strScan.next());
            }
        }
        System.out.println("Count is: " + hadCount);
    }
}

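With "\\s*" as the delimiter, the output is each non-whitespace character as a separate token, which makes sense to me. When the delimiter is changed to "\\s+" or "[\\s*]", the output is: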

String is:
Smith , where Jones had had 'had', had had 'had had'.
Token sought is had
Token is    : Smith
Token is    : ,
Token is    : where
Token is    : Jones
Token found!: had
Token found!: had
Token is    : 'had',
Token found!: had
Token found!: had
Token is    : 'had
Token is    : had'.
Count is: 4

2 Answers

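Square brackets [] enclose a character class. Inside them, the rules about special characters are different. The only special characters are "the closing bracket (]), the backslash (\), the caret (^) and the hyphen (-)" (taken from this page). The * is not on that list, so inside the brackets it is a literal asterisk, not a quantifier.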

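So in this case [\\s*] means "a single whitespace character or a literal *". It only looks equivalent to \\s+ in your program because the words in the input are separated by single spaces and the input contains no *. On input with runs of whitespace the two differ: \\s+ consumes the whole run as one delimiter, while [\\s*] matches one character at a time.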
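To see the difference directly, here is a minimal sketch. The class DelimiterDemo and the helper tokens(...) are made up for illustration and are not part of the original program. The helper collects whatever tokens a Scanner yields for a given delimiter, and the input contains a * so that the two patterns behave differently:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Hypothetical helper: collects every token a Scanner yields
    // when it is configured with the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "one*two three";
        // Character class: one whitespace character OR a literal '*' is a delimiter.
        System.out.println(tokens(input, "[\\s*]")); // [one, two, three]
        // Quantified \s: only runs of whitespace are delimiters, so '*' stays in the token.
        System.out.println(tokens(input, "\\s+"));   // [one*two, three]
    }
}
```

On the string from the question, which has no * and only single spaces between words, both delimiters happen to produce the same tokens.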

When working with regular expressions, sites such as RegexPlanet (for testing code) or Regexper (for visualizing a regex graphically) can help.

answered 2013-03-10T02:34:50.283

[] is a character class. Look at these examples: [abc] means a|b|c. If you write something like [a*], it means a|\\* (the character a or the character *).
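A short sketch of that point, checking whole-string matches with Pattern.matches. The class CharClassDemo and the wrapper fullMatch(...) are hypothetical names used only for this illustration:

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical wrapper: true if the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Inside brackets '*' is a literal: [a*] matches exactly one character, 'a' or '*'.
        System.out.println(fullMatch("[a*]", "a"));  // true
        System.out.println(fullMatch("[a*]", "*"));  // true
        System.out.println(fullMatch("[a*]", "aa")); // false
        // Outside brackets '*' is a quantifier: a* matches zero or more 'a' characters.
        System.out.println(fullMatch("a*", "aa"));   // true
        System.out.println(fullMatch("a*", ""));     // true
        System.out.println(fullMatch("a*", "*"));    // false
    }
}
```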

answered 2013-03-10T02:35:45.360