1

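Until now I had a regular expression that met all my requirements, but then I got a string containing reserved characters such as the + in C++ and the # in C#. The code below works for my whole set of words, except for C++ and C#.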

MatchCollection matches= Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\bC\+\+\b");
foreach(Match m in matches)
{
      Console.Write(m.Value);
}

Can anyone point out the reason?

4 Answers

3

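You should use \B instead of \b for the second boundary. The # character is not a word character, so there is no word boundary between it and the space that follows; \B matches exactly at positions where \b does not.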

MatchCollection matches= Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\bC\#\B");

You can read the following link for more information: http://www.regular-expressions.info/wordboundaries.html
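
To make the difference concrete: \b only matches between a word character (letter, digit or underscore) and a non-word character, and in "C# We'll" both the # and the following space are non-word characters. The following is a minimal sketch of that behaviour; the class name BoundaryDemo, the helper Find and the sample sentence are illustrative and not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Returns every match of pattern in input, joined with commas.
    public static string Find(string input, string pattern) =>
        string.Join(",", Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));

    public static void Main()
    {
        const string text = "This program is written in C# and C++ today";

        // Trailing \b: '#' or '+' and the following space are both non-word
        // characters, so there is no word boundary and nothing matches.
        Console.WriteLine("[" + Find(text, @"\bC#\b") + "]");      // prints []
        Console.WriteLine("[" + Find(text, @"\bC\+\+\b") + "]");   // prints []

        // Trailing \B: matches between two non-word characters.
        Console.WriteLine("[" + Find(text, @"\bC#\B") + "]");      // prints [C#]
        Console.WriteLine("[" + Find(text, @"\bC\+\+\B") + "]");   // prints [C++]
    }
}
```

Note that \B is still position-sensitive: it rejects C# when a letter or digit follows directly (as in "C#x"), because a word boundary exists there.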

Answered 2013-12-03T13:51:20.933
1

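You can use the following pattern, which stores the match in group 1:

Pattern

\b(C(?:\+\+|\#))\s

And this C# code:

Code

MatchCollection matches = Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\b(C(?:\+\+|\#))\s");

foreach (Match m in matches)
{
     // Group 1 holds "C#" or "C++" without the trailing whitespace.
     Console.Write(m.Groups[1].Value);
}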

Input

This  program is written in C# We'll delete it after ten days

Output

C#

Input

This  program is written in C++ We'll delete it after ten days

Output

C++
Answered 2013-12-03T14:05:40.697
0

The code below works for my whole set of words, except for C++ and C#

To make the match work, you need a regular expression like @"(?:C\+\+)|(?:C#)"; here is a regex101 demo that proves it.
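
One thing to keep in mind with this pattern is that it carries no boundary checks at all, so it also matches inside longer tokens. A small sketch of that behaviour follows; the class name AlternationDemo, the helper FindAll and the sample strings are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Returns every occurrence of "C++" or "C#" found anywhere in input.
    public static string[] FindAll(string input) =>
        Regex.Matches(input, @"(?:C\+\+)|(?:C#)")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        // Both names are found in ordinary text.
        Console.WriteLine(string.Join(" ", FindAll("written in C# and C++")));  // prints C# C++

        // With no boundaries, a match inside a longer token is reported too.
        Console.WriteLine(string.Join(" ", FindAll("ABC#x")));                  // prints C#
    }
}
```

If matches inside longer tokens are unwanted, the pattern needs some form of delimiter check around the alternation.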

Answered 2013-12-03T13:51:23.513
0

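In your case, rather than looking for word boundaries (\b) or non-word boundaries (\B), consider looking for whitespace (\s+), the start of the line (^) and the end of the line ($).

Here is a regular expression that does this: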

(?:^|\s+)(C#|C\+\+)(?=\s+|$)

Here is a Perl program that demonstrates the regular expression on a sample data set. (See also the live demo.)

#!/usr/bin/perl -w

use strict;
use warnings;

while (<DATA>) {
    chomp;

#   A - Preceded by the beginning of the line or 1 or more whitespace
#       characters
#   B - The character sequences 'C#' or 'C++'
#   C - Followed by 1 or more whitespace characters or the end of line.

    if (/(?:^|\s+)(C#|C\+\+)(?=\s+|$)/) {
#           ^^^^^  ^^^^^^^^    ^^^^^
#             A        B         C

        print "[$1] [$_]\n";
    } else {
        print "[--] [$_]\n";
    }
}

__END__
This program is written in C++ We'll delete it after ten days
This program is written in !C++ We'll delete it after ten days
This program is written in C++! We'll delete it after ten days
This program is written in C# We'll delete it after ten days
C# is the language this program is written in.
 C# is the language this program is written in.
C++ is the language this program is written in.
This program is written in C#
This program is written in C++
This program is written in C++!

Expected output:

[C++] [This program is written in C++ We'll delete it after ten days]
[--] [This program is written in !C++ We'll delete it after ten days]
[--] [This program is written in C++! We'll delete it after ten days]
[C#] [This program is written in C# We'll delete it after ten days]
[C#] [C# is the language this program is written in.]
[C#] [ C# is the language this program is written in.]
[C++] [C++ is the language this program is written in.]
[C#] [This program is written in C#]
[C++] [This program is written in C++]
[--] [This program is written in C++!]
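
Since the question uses C#, the same regular expression can be tried in .NET as well. This is a rough port of the Perl program above, not part of the original answer; the names WhitespaceDelimitedDemo and Extract are hypothetical, and only a few of the sample lines are repeated.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDelimitedDemo
{
    // Same pattern as in the Perl program: start of line or whitespace,
    // then C# or C++, then whitespace or end of line (lookahead).
    private static readonly Regex Lang =
        new Regex(@"(?:^|\s+)(C#|C\+\+)(?=\s+|$)");

    // Returns the captured language name, or "--" when nothing matches.
    public static string Extract(string line)
    {
        Match m = Lang.Match(line);
        return m.Success ? m.Groups[1].Value : "--";
    }

    public static void Main()
    {
        string[] lines =
        {
            "This program is written in C++ We'll delete it after ten days",
            "This program is written in !C++ We'll delete it after ten days",
            "This program is written in C++! We'll delete it after ten days",
            "C# is the language this program is written in.",
            "This program is written in C#",
        };

        foreach (string line in lines)
        {
            Console.WriteLine($"[{Extract(line)}] [{line}]");
        }
    }
}
```

The design trade-off is that punctuation counts as part of the token, so "C++!" or "!C++" is rejected, just as in the Perl output above.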
Answered 2013-12-03T15:28:05.487