
I need to check some texts against a pattern (I have to check whether my pattern occurs in many texts).

Here is my example:

String pattern = "^[a-zA-Z ]*toto win(\\W)*[a-zA-Z ]*$";    
if("toto win because of".matches(pattern))
 System.out.println("we have a winner");
else
 System.out.println("we DON'T have a winner");

In my tests the pattern must match, but it does not match with my regex. It must match:

" toto win bla bla"

"toto win because of"
"toto win. bla bla"


"here. toto win. bla bla"
"here? toto win. bla bla"

"here %dfddfd . toto win. bla bla"

It must not match:

" -toto win bla bla"
" pretoto win bla bla"

I tried to do this with my regex, but it doesn't work.

Can you point out what I'm doing wrong?


5 Answers

1

Just change your code to String pattern = "\\s*toto win[\\w\\s]*";

\W means a non-word character, \w means a word character (a-zA-Z_0-9).

[\\w\\s]* will match any number of word characters and whitespace after "toto win".
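
As a rough sketch of how this first suggestion behaves: String.matches only succeeds when the whole input matches the pattern. The class name FirstPatternDemo and the helper isWinner are made up for illustration and are not part of the question.

```java
class FirstPatternDemo {
    // Pattern from the answer: optional leading whitespace, "toto win",
    // then any run of word characters and whitespace.
    static final String PATTERN = "\\s*toto win[\\w\\s]*";

    // Hypothetical helper: true only if the entire input matches PATTERN.
    static boolean isWinner(String text) {
        return text.matches(PATTERN);
    }

    public static void main(String[] args) {
        System.out.println(isWinner(" toto win bla bla"));    // true
        System.out.println(isWinner("toto win because of"));  // true
        System.out.println(isWinner("toto win. bla bla"));    // false: '.' is neither \w nor \s
        System.out.println(isWinner(" -toto win bla bla"));   // false
    }
}
```

The third call shows why punctuation after "toto win" needs to be allowed explicitly.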

Update

To reflect your new requirements, this expression will work:

"((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*"

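((.*\\s)+|^) matches either anything followed by at least one whitespace character, or the beginning of the line.

[\\w\\s\\p{Punct}]* matches any combination of word characters, digits, whitespace, and punctuation.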
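
A small harness can check the updated expression against the examples from the question. This is only a sketch: the class name UpdatedPatternDemo, the array names and the helper isWinner are assumptions made for illustration.

```java
class UpdatedPatternDemo {
    // Updated expression from the answer, used with String.matches (whole-input match).
    static final String PATTERN = "((.*\\s)+|^)toto win[\\w\\s\\p{Punct}]*";

    // Examples the question says must match.
    static final String[] MATCH_THESE = {
        " toto win bla bla",
        "toto win because of",
        "toto win. bla bla",
        "here. toto win. bla bla",
        "here? toto win. bla bla",
        "here %dfddfd . toto win. bla bla"
    };

    // Examples the question says must not match.
    static final String[] DONT_MATCH_THESE = {
        " -toto win bla bla",
        " pretoto win bla bla"
    };

    // Hypothetical helper: true only if the entire input matches PATTERN.
    static boolean isWinner(String text) {
        return text.matches(PATTERN);
    }

    public static void main(String[] args) {
        for (String s : MATCH_THESE) {
            System.out.println("\"" + s + "\" -> " + isWinner(s));
        }
        for (String s : DONT_MATCH_THESE) {
            System.out.println("\"" + s + "\" -> " + isWinner(s));
        }
    }
}
```

One thing to keep in mind: a nested quantifier such as (.*\\s)+ may backtrack heavily on long inputs with many spaces, so it is probably worth testing against realistic text sizes.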

answered 2012-06-12T09:07:44.847
1

This will work:

(?im)^[?.\s%a-z]*?\btoto win\b.+$

Explanation

"(?im)" +         // Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m)
"^" +             // Assert position at the beginning of a line (at beginning of the string or after a line break character)
"[?.\\s%a-z]" +    // Match a single character present in the list below
                     // One of the characters “?.”
                     // A whitespace character (spaces, tabs, and line breaks)
                     // The character “%”
                     // A character in the range between “a” and “z”
   "*?" +            // Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
"\\b" +            // Assert position at a word boundary
"toto\\ win" +     // Match the characters “toto win” literally
"\\b" +            // Assert position at a word boundary
"." +             // Match any single character that is not a line break character
   "+" +             // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"$"               // Assert position at the end of a line (at the end of the string or before a line break character)

Update 1

(?im)^[?~`'!@#$%^&*+.\s%a-z]*? toto win\b.*$

Update 2

(?im)^[^-]*?\btoto win\b.*$

Update 3

(?im)^.*?(?<!-)toto win\b.*$

Explanation

"(?im)" +       // Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m)
"^" +           // Assert position at the beginning of a line (at beginning of the string or after a line break character)
"." +           // Match any single character that is not a line break character
   "*?" +          // Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
"(?<!" +        // Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
   "-" +           // Match the character “-” literally
")" +
"toto\\ win" +   // Match the characters “toto win” literally
"\\b" +          // Assert position at a word boundary
"." +           // Match any single character that is not a line break character
   "*" +           // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"$"             // Assert position at the end of a line (at the end of the string or before a line break character)

The regex needs to be escaped before it can be used in code.
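
To illustrate the escaping, here is a sketch that writes the "Update 3" expression as a Java string literal (each backslash doubled) and runs it with Matcher.find. The class name EscapedRegexDemo and the STRICTER variant are assumptions added for illustration, not part of the answer. As far as I can tell, Update 3 as written still accepts " pretoto win bla bla", because its lookbehind only excludes a preceding "-"; the STRICTER variant also excludes a preceding word character.

```java
import java.util.regex.Pattern;

class EscapedRegexDemo {
    // "Update 3" from the answer, escaped for a Java string literal.
    static final Pattern UPDATE_3 =
            Pattern.compile("(?im)^.*?(?<!-)toto win\\b.*$");

    // Assumed variant (not from the answer): the lookbehind rejects
    // a preceding '-' as well as any preceding word character.
    static final Pattern STRICTER =
            Pattern.compile("(?im)^.*?(?<![-\\w])toto win\\b.*$");

    // Hypothetical helper: true if the pattern is found anywhere in the text.
    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found(UPDATE_3, "here. toto win. bla bla")); // true
        System.out.println(found(UPDATE_3, " -toto win bla bla"));      // false
        System.out.println(found(UPDATE_3, " pretoto win bla bla"));    // true
        System.out.println(found(STRICTER, " pretoto win bla bla"));    // false
        System.out.println(found(STRICTER, "toto win because of"));     // true
    }
}
```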

answered 2012-06-12T09:44:48.137
0

Your pattern is missing the whitespace between win and the next word.

Try this: \\stoto\\swin\\s\\w

You can try out your regex here: http://gskinner.com/RegExr/

answered 2012-06-12T08:58:11.017
0

The following regex

^[a-zA-Z. ]*toto win[a-zA-Z. ]*$

will match

 toto win bla bla
toto win because of
toto win. bla bla

and will not match

 -toto win bla bla

answered 2012-06-12T09:00:01.643
0

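It would be easier if you included the actual requirements rather than a list of things to (not) match. I strongly suspect that "toto winabc" should not match, but I can't be sure, because you didn't include such an example or explain the requirements. Anyway, this works for all of your current examples: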

import java.util.regex.Pattern;

public class TotoWinTest {

    // Inputs that the pattern must accept.
    static String[] matchThese = new String[] {
            " toto win bla bla",
            "toto win because of",
            "toto win. bla bla",
            "here. toto win. bla bla",
            "here? toto win. bla bla",
            "here %dfddfd . toto win. bla bla"
    };

    // Inputs that the pattern must reject.
    static String[] dontMatchThese = new String[] {
            " -toto win bla bla",
            " pretoto win bla bla"
    };

    public static void main(String[] args) {
        // either the beginning of the input or a whitespace character, followed by "toto win"
        Pattern p = Pattern.compile("(^|\\s)toto win");

        // find() looks for the pattern anywhere in the input, unlike String.matches
        System.out.println("Should match:");
        for (String s : matchThese) {
            System.out.println(p.matcher(s).find());
        }

        System.out.println("Shouldn't match:");
        for (String s : dontMatchThese) {
            System.out.println(p.matcher(s).find());
        }
    }
}
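
If "toto winabc" really should be rejected (which is only the guess voiced above, not a stated requirement), a trailing word boundary is one way to sketch it. The class name TrailingBoundaryDemo and the constant names are made up for illustration.

```java
import java.util.regex.Pattern;

class TrailingBoundaryDemo {
    // Pattern from this answer: start of input or whitespace, then "toto win".
    static final Pattern LOOSE = Pattern.compile("(^|\\s)toto win");

    // Assumed variant: \b requires "win" to end at a word boundary.
    static final Pattern STRICT = Pattern.compile("(^|\\s)toto win\\b");

    public static void main(String[] args) {
        System.out.println(LOOSE.matcher("toto winabc").find());        // true
        System.out.println(STRICT.matcher("toto winabc").find());       // false
        System.out.println(STRICT.matcher("toto win. bla bla").find()); // true
    }
}
```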
answered 2012-06-12T10:56:07.853