Description
Using lookaheads, this regular expression captures each complete sentence that contains both Cello and Lillian.
(?:(?<=\.)\s+|^)((?=(?:(?!\.(?:\s|$)).)*?\b[Cc]ello(?=\s|\.|$))(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$)).*?\.(?=\s|$))
The expression breaks down into these functional components:
(?:(?<=\.)\s+|^)
Start matching the sentence either after a period (.) followed by any amount of whitespace, or at the start of the string
(
Open capture group 1, which will capture the entire sentence
(?=
Start of the lookahead
(?:(?!\.(?:\s|$)).)*?
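Keeps the regex engine from leaving this sentence: before consuming each character, it must confirm that the character is not a period (.) followed by whitespace or the end of the string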
\b
Match a word boundary
[Cc]ello
Match the desired text, either all lowercase or with a capitalized first letter
(?=\s|\.|$)
Look ahead to ensure the word is followed by whitespace, a period (.), or the end of the string
)
End of the lookahead
(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$))
This does essentially the same thing, but for Lillian
.*?\.(?=\s|$)
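Capture the rest of the sentence up to and including the period, and ensure that the period is followed by whitespace or the end of the string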
)
Close capture group 1
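The breakdown above can be sketched in Python using the re module's verbose mode, which ignores whitespace and # comments inside the pattern, so each component can sit on its own annotated line. This is only an illustrative sketch: the name SENTENCE_WITH_BOTH and the two sample strings are assumptions made for the example, not part of the original answer.

```python
import re

# Same expression as above, spread out with re.VERBOSE so every component
# carries its own comment. re.DOTALL lets . match newlines.
SENTENCE_WITH_BOTH = re.compile(
    r"""
    (?:(?<=\.)\s+|^)             # start after a period plus whitespace, or at string start
    (                            # open capture group 1: the whole sentence
      (?=                        # lookahead 1: Cello occurs in this sentence
        (?:(?!\.(?:\s|$)).)*?    # step forward, never past a sentence-ending period
        \b[Cc]ello(?=\s|\.|$)
      )
      (?=                        # lookahead 2: Lillian occurs in this sentence
        (?:(?!\.(?:\s|$)).)*?
        \b[Ll]illian(?=\s|\.|$)
      )
      .*?\.(?=\s|$)              # consume up to the sentence-ending period
    )                            # close capture group 1
    """,
    re.VERBOSE | re.DOTALL,
)

# Both names in one sentence: the sentence is captured.
print(SENTENCE_WITH_BOTH.findall("Lillian plays the cello."))
# ['Lillian plays the cello.']

# The names sit in different sentences, so neither sentence qualifies.
print(SENTENCE_WITH_BOTH.findall(
    "Cello has no friends. And Lillian also hasn't met anyone."
))
# []
```

The second call shows what the (?:(?!\.(?:\s|$)).)*? component is for: without it, the lookaheads could scan into the next sentence and report a match.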
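Code Example
I don't know Python well, so I'm providing a PHP example. Note that in the match statement I used the s option, which allows the . in the expression to match newline characters.
Input Text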
Cello is a yellow parakeet who sings with Lillian. Toby is a clown who doesn't sing. Willy is a Wonka. Cello is a yellow Lillian.
Cello likes Lillian and kittens.
Lillian likes Cello and dogs. Cello has no friends. And Lillian also hasn't met anyone.
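Code
<?php
// Replace this placeholder with the input text shown above.
$sourcestring = "your source string";
// The trailing s modifier lets . match newline characters.
preg_match_all('/(?:(?<=\.)\s+|^)((?=(?:(?!\.(?:\s|$)).)*?\b[Cc]ello(?=\s|\.|$))(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$)).*?\.(?=\s|$))/s', $sourcestring, $matches);
// $matches[0] holds the full matches; $matches[1] holds capture group 1.
echo "<pre>" . print_r($matches, true);
?>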
Matches
$matches Array:
(
[0] => Array
(
[0] => Cello is a yellow parakeet who sings with Lillian.
[1] => Cello is a yellow Lillian.
[2] =>
Cello likes Lillian and kittens.
[3] =>
Lillian likes Cello and dogs.
)
[1] => Array
(
[0] => Cello is a yellow parakeet who sings with Lillian.
[1] => Cello is a yellow Lillian.
[2] => Cello likes Lillian and kittens.
[3] => Lillian likes Cello and dogs.
)
)
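For readers working in Python, the PHP snippet above might translate roughly as follows. This is a sketch rather than a tested port of the author's code: it assumes Python's standard re module, where re.DOTALL plays the role of PHP's s option, and the names PATTERN, text, and sentences are made up for the example.

```python
import re

# The same expression as in the PHP example, unchanged.
PATTERN = re.compile(
    r"(?:(?<=\.)\s+|^)"
    r"((?=(?:(?!\.(?:\s|$)).)*?\b[Cc]ello(?=\s|\.|$))"
    r"(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$))"
    r".*?\.(?=\s|$))",
    re.DOTALL,  # equivalent of the PHP s option: . also matches newlines
)

# The input text from above, including its two line breaks.
text = (
    "Cello is a yellow parakeet who sings with Lillian. "
    "Toby is a clown who doesn't sing. Willy is a Wonka. "
    "Cello is a yellow Lillian.\n"
    "Cello likes Lillian and kittens.\n"
    "Lillian likes Cello and dogs. Cello has no friends. "
    "And Lillian also hasn't met anyone."
)

# With exactly one capture group, findall returns the group 1 text of each
# match, so the leading whitespace seen in PHP's $matches[0] is not included.
sentences = PATTERN.findall(text)
for sentence in sentences:
    print(sentence)
```

This prints the same four sentences that appear under [1] in the PHP output.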
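If you absolutely need to match only sentences where the string Cello appears before Lillian, you can use an expression like this one. Here I simply moved one closing parenthesis, so the Lillian lookahead now sits inside the Cello lookahead and only starts scanning after Cello has been matched.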
(?:(?<=\.)\s+|^)((?=(?:(?!\.(?:\s|$)).)*?\b[Cc]ello(?=\s|\.|$)(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$))).*?\.(?=\s|$))
Input Text
Cello is a yellow parakeet who sings with Lillian. Toby is a clown who doesn't sing. Willy is a Wonka. Cello is a yellow Lillian.
Cello likes Lillian and kittens.
Lillian likes Cello and dogs. Cello has no friends. And Lillian also hasn't met anyone.
Output of Capture Group 1
[1] => Array
(
[0] => Cello is a yellow parakeet who sings with Lillian.
[1] => Cello is a yellow Lillian.
[2] => Cello likes Lillian and kittens.
)
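The ordered variant can be checked in Python in the same way. Again this is only a sketch under the assumption that Python's re module is used; ORDERED, text, and ordered_sentences are illustrative names.

```python
import re

# Ordered variant: the Lillian lookahead is nested inside the Cello lookahead,
# so Lillian must appear after Cello within the same sentence.
ORDERED = re.compile(
    r"(?:(?<=\.)\s+|^)"
    r"((?=(?:(?!\.(?:\s|$)).)*?\b[Cc]ello(?=\s|\.|$)"
    r"(?=(?:(?!\.(?:\s|$)).)*?\b[Ll]illian(?=\s|\.|$)))"
    r".*?\.(?=\s|$))",
    re.DOTALL,
)

text = (
    "Cello is a yellow parakeet who sings with Lillian. "
    "Toby is a clown who doesn't sing. Willy is a Wonka. "
    "Cello is a yellow Lillian.\n"
    "Cello likes Lillian and kittens.\n"
    "Lillian likes Cello and dogs. Cello has no friends. "
    "And Lillian also hasn't met anyone."
)

ordered_sentences = ORDERED.findall(text)
for sentence in ordered_sentences:
    print(sentence)
```

"Lillian likes Cello and dogs." no longer appears, because no Lillian follows Cello before that sentence ends.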