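In your case, instead of looking for word boundaries (\b) or non-word boundaries (\B), consider looking for whitespace (\s+), the start of the line (^), and the end of the line ($).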
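To see why \b is awkward here, recall that # and + are non-word characters, so \b only matches after them when a word character follows. The sketch below is a minimal illustration, not part of the original answer; the helper names matches_with_b and matches_with_ws and the three sample strings are made up for the comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the \b-based pattern matches, 0 otherwise.
sub matches_with_b {
    my ($text) = @_;
    return $text =~ /\b(C#|C\+\+)\b/ ? 1 : 0;
}

# Hypothetical helper: same check using whitespace and line anchors.
sub matches_with_ws {
    my ($text) = @_;
    return $text =~ /(?:^|\s+)(C#|C\+\+)(?=\s+|$)/ ? 1 : 0;
}

# Illustrative inputs (assumptions, not taken from the sample data set).
for my $text ('written in C# today', 'written in C#today', 'written in C++') {
    printf "%-22s  \\b: %d  whitespace: %d\n",
        "[$text]", matches_with_b($text), matches_with_ws($text);
}
```

With \b, 'C# today' and a trailing 'C++' are missed because there is no boundary between a non-word character and a space or the end of the string, while 'C#today' matches even though it probably should not. The whitespace-based pattern behaves the other way around in all three cases.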
Here is a regular expression that does this:
(?:^|\s+)(C#|C\+\+)(?=\s+|$)
Here is a Perl program that demonstrates the regular expression on a sample data set. (See also the live demo.)
#!/usr/bin/perl -w
use strict;
use warnings;
while (<DATA>) {
    chomp;
    # A - Preceded by the beginning of the line or 1 or more whitespace
    #     characters
    # B - The character sequences 'C#' or 'C++'
    # C - Followed by 1 or more whitespace characters or the end of line.
    if (/(?:^|\s+)(C#|C\+\+)(?=\s+|$)/) {
    #      ^^^^^   ^^^^^^^^   ^^^^^
    #        A        B         C
        print "[$1] [$_]\n";
    } else {
        print "[--] [$_]\n";
    }
}
__END__
This program is written in C++ We'll delete it after ten days
This program is written in !C++ We'll delete it after ten days
This program is written in C++! We'll delete it after ten days
This program is written in C# We'll delete it after ten days
C# is the language this program is written in.
 C# is the language this program is written in.
C++ is the language this program is written in.
This program is written in C#
This program is written in C++
This program is written in C++!
Expected output:
[C++] [This program is written in C++ We'll delete it after ten days]
[--] [This program is written in !C++ We'll delete it after ten days]
[--] [This program is written in C++! We'll delete it after ten days]
[C#] [This program is written in C# We'll delete it after ten days]
[C#] [C# is the language this program is written in.]
[C#] [ C# is the language this program is written in.]
[C++] [C++ is the language this program is written in.]
[C#] [This program is written in C#]
[C++] [This program is written in C++]
[--] [This program is written in C++!]