2

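I am trying to extract the strings that occur inside delimiters (parentheses in this case), but not those that sit inside quotes (single or double). Here is what I have tried: this regex picks up every occurrence inside parentheses, including the ones inside quotes, which I do not want.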

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMain {
    static final String PATTERN = "\\(([^)]+)\\)";
    static final Pattern CONTENT = Pattern.compile(PATTERN);
    /**
     * @param args
     */
    public static void main(String[] args) {
        String testString = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        Matcher match = CONTENT.matcher(testString);
        while(match.find()) {
            System.out.println(match.group()); // prints (Jack), (Jill) and (Peter's), parentheses included
        }
    }
}

3 Answers

1

You can try this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMain {
    static final String PATTERN = "\\(([^)]+)\\)|\"[^\"]*\"";
    static final Pattern CONTENT = Pattern.compile(PATTERN);
    /**
     * @param args
     */
    public static void main(String[] args) {
        String testString = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        Matcher match = CONTENT.matcher(testString);
        while(match.find()) {
            if(match.group(1) != null) {
                System.out.println(match.group(1)); // prints Jack, Jill
            }
        }
    }
}

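This pattern matches double-quoted strings as well as parenthesized ones, but only the parenthesized ones put anything in group(1). Because the regex engine scans from left to right and tries the alternatives at the earliest possible position, it reaches the opening double quote before the parenthesis inside it, so it consumes "(Peter's)" as a whole instead of matching (Peter's).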
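The question also mentions single quotes, which the pattern above does not skip. Here is a minimal sketch of one way to extend it, assuming quotes are never nested or escaped and that an apostrophe used as plain punctuation only appears inside double-quoted text (otherwise it would be misread as an opening quote). The class name `QuoteAwareExtractor` and the method `extract` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAwareExtractor {
    // Alternative 1 captures parenthesized content in group 1.
    // Alternatives 2 and 3 consume double- and single-quoted runs without capturing.
    static final Pattern CONTENT =
            Pattern.compile("\\(([^)]+)\\)|\"[^\"]*\"|'[^']*'");

    // Hypothetical helper: collects group 1 of every match that has one.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = CONTENT.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "Rhyme (Jack) and '(Jill)' went up the hill on \"(Peter's)\" request.";
        System.out.println(extract(s)); // prints [Jack]
    }
}
```

The order of the alternatives does not matter much here, since each one starts with a different character; what matters is that a quoted run is consumed as a single match so the parentheses inside it are never examined on their own.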

Answered 2013-01-03T15:56:20.437
1

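In this case you can use the lookbehind and lookahead operators to get what you want quite elegantly. Here is a solution in Python (I always use it to try things out quickly on the command line), but the regex should be the same in Java code.

This regex matches content that is preceded by an opening parenthesis (positive lookbehind) and followed by a closing parenthesis (positive lookahead). However, it rejects the match when the opening parenthesis is itself preceded by a single or double quote (negative lookbehind), or when the closing parenthesis is followed by a single or double quote (negative lookahead).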

In [1]: import re

In [2]: s = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request."

In [3]: re.findall(r"""
   ...:     (?<=               # start of positive look-behind
   ...:         (?<!           # start of negative look-behind
   ...:             [\"\']     # avoids matching opening parenthesis preceded by single or double quote
   ...:         )              # end of negative look-behind
   ...:         \(             # matches opening parenthesis
   ...:     )                  # end of positive look-behind
   ...:     \w+ (?: \'\w* )?   # matches whatever your content looks like (configure this yourself)             
   ...:     (?=                # start of positive look-ahead
   ...:         \)             # matches closing parenthesis 
   ...:         (?!            # start of negative look-ahead
   ...:             [\"\']     # avoids matching closing parenthesis succeeded by single or double quote
   ...:         )              # end of negative look-ahead  
   ...:     )                  # end of positive look-ahead
   ...:     """, 
   ...:     s, 
   ...:     flags=re.X)
Out[3]: ['Jack', 'Jill']
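For reference, the same lookaround pattern can be carried over to Java roughly as follows. This is only a sketch: the class name `LookaroundExtractor` and the method `extract` are hypothetical, and the content sub-pattern `\w+(?:'\w*)?` is the same placeholder as in the Python version, so it would need adjusting for other kinds of content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundExtractor {
    static final Pattern CONTENT = Pattern.compile(
            "(?<=(?<![\"'])\\()"    // preceded by "(" that is not itself preceded by a quote
            + "\\w+(?:'\\w*)?"      // the content (placeholder, adjust as needed)
            + "(?=\\)(?![\"']))"); // followed by ")" that is not itself followed by a quote

    // Hypothetical helper: returns every match of the pattern in order.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = CONTENT.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "Rhyme (Jack) and (Jill) went up the hill on \"(Peter's)\" request.";
        System.out.println(extract(s)); // prints [Jack, Jill]
    }
}
```

Note that this only looks at the characters directly next to the parentheses, so a quoted string with anything between the quote and the parenthesis (a space, for instance) would still be matched.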
Answered 2013-01-03T19:06:40.637
0

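Note: this is not a complete answer, since I am not familiar with Java, but I believe it can still be converted to Java.

As far as I am concerned, the simplest approach is to replace the quoted parts of the string with an empty string and then look for the matches. Hopefully you are somewhat familiar with PHP; here is the idea.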

$str = "Rhyme (Jack) and (Jill) went up the hill on \" (Peter's)\" request.";

preg_match_all(
    $pat = '~(?<=\().*?(?=\))~',
    // anything inside parentheses
    preg_replace('~([\'"]).*?\1~','',$str),
    // this replaces quoted strings with ''
    $matches
    // and assigns the result into this variable
);
print_r($matches[0]);
// $matches[0] returns the matches in preg_match_all

// [0] => Jack
// [1] => Jill
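A rough Java equivalent of the same two-step idea might look like this. It is a sketch under the same assumptions as the PHP version (quotes are balanced and not escaped, and no stray apostrophe appears outside a quoted part); the class name `StripQuotesExtractor` and the method `extract` are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripQuotesExtractor {
    // A quote character, the shortest run after it, then the same quote character again.
    static final Pattern QUOTED = Pattern.compile("(['\"]).*?\\1");
    // Anything between "(" and the next ")", without the parentheses themselves.
    static final Pattern PARENS = Pattern.compile("(?<=\\().*?(?=\\))");

    // Hypothetical helper: removes quoted parts first, then collects parenthesized content.
    static List<String> extract(String input) {
        String unquoted = QUOTED.matcher(input).replaceAll("");
        List<String> result = new ArrayList<>();
        Matcher m = PARENS.matcher(unquoted);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "Rhyme (Jack) and (Jill) went up the hill on \" (Peter's)\" request.";
        System.out.println(extract(s)); // prints [Jack, Jill]
    }
}
```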
Answered 2013-01-03T14:32:19.267