java - How do I fix this regular expression?

Question

(W[AY]|C[AO])(\\s+\\d{5})

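This currently parses states starting with W or C, followed by a zip code. However, it returns all of this as one group: for example, WA 98121 CA 56679 returns group 1 as WA 98121 and group 2 as CA 56679.

How can I fix this so that I retrieve group 1 = WA, group 2 = 98121, group 3 = CA, group 4 = 56679?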

score 5 · Accepted Answer

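Typically, you want to capture all parts of your search "phrase" in a single match, using groups to break out the individual parts of the phrase, then process that match and match again for the next phrase. So I'm going to sidestep your question by showing you how to code using this approach.

Here's some runnable code that demonstrates how to match and use groups correctly: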

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Regex to match a "state zip" sequence, and capture each part in its own group
String regex = "(W[AY]|C[AO])\\s+(\\d{5})";

// Some sample input
String input = "blah blah WA 98121 blah blah CA 56679 blah blah";

// Pattern has no getMatcher(); the method is Pattern.matcher(CharSequence)
Matcher matcher = Pattern.compile(regex).matcher(input);
while (matcher.find()) { // move to next match, if one exists
    String state = matcher.group(1);
    String zip = matcher.group(2);
    // Work with state and zip values
    System.out.println("State = " + state + ", zip = " + zip);
}

Output:

State = WA, zip = 98121
State = CA, zip = 56679

Note that captured regex groups are numbered starting from 1.
FYI, group 0 is the entire match.
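
To make the group numbering concrete, here is a minimal sketch that prints group 0 next to groups 1 and 2 for each match, and also gathers the captured parts into one flat list in the order the question asked for (WA, 98121, CA, 56679). The class name `StateZipGroups` and the helper `flatten` are hypothetical names chosen for illustration; only the regex comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StateZipGroups {
    // Same pattern as in the answer: group 1 = state, group 2 = zip
    private static final Pattern STATE_ZIP =
            Pattern.compile("(W[AY]|C[AO])\\s+(\\d{5})");

    // Hypothetical helper: collects state, zip, state, zip, ... in input order
    static List<String> flatten(String input) {
        List<String> parts = new ArrayList<>();
        Matcher matcher = STATE_ZIP.matcher(input);
        while (matcher.find()) {
            parts.add(matcher.group(1)); // state
            parts.add(matcher.group(2)); // zip
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "WA 98121 CA 56679";
        Matcher matcher = STATE_ZIP.matcher(input);
        while (matcher.find()) {
            // group(0) is the whole match; groups 1 and 2 are the captured parts
            System.out.println("group 0 = " + matcher.group(0)
                    + ", group 1 = " + matcher.group(1)
                    + ", group 2 = " + matcher.group(2));
        }
        // Prints [WA, 98121, CA, 56679]
        System.out.println(flatten(input));
    }
}
```

Group numbers restart with every call to `find()`, so there is no "group 3" or "group 4" for the second state and zip; a list built across matches is one way to get the flat sequence.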
