Assuming C#:
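using System.Collections.Specialized;   // StringCollection
using System.Text.RegularExpressions;   // Regex, Match, RegexOptions

// subjectString holds the text to scan (for example, the contents of the file).
StringCollection resultList = new StringCollection();
Regex regexObj = new Regex("^.*<(?:/?b|/?em|/?su[pb]|/?[ou]l|/?li|span style=\"text-decoration: underline;\" data-mce-style=\"text-decoration: underline;\"|/span)>(?! ).*$", RegexOptions.Multiline);
// Walk through the matches; each match is one whole line.
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    resultList.Add(matchResult.Value);
    matchResult = matchResult.NextMatch();
}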
This returns every line in the file in which at least one of the listed tags is not immediately followed by a space.
Input:
This </b> is <b> OK
This <b> is </b>not OK
Neither <b>is </b> this.
Output:
This <b> is </b>not OK
Neither <b>is </b> this.
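To try the input and output above end to end, here is a minimal, self-contained sketch. The class name `TagSpacingDemo` and the helper `FindLinesWithUnspacedTag` are hypothetical names chosen for illustration. The pattern is the same one as in the snippet, only written as a verbatim string, where `""` stands for a literal double quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagSpacingDemo
{
    // Same pattern as in the answer, written as a verbatim string ("" is an escaped quote).
    private static readonly Regex UnspacedTag = new Regex(
        @"^.*<(?:/?b|/?em|/?su[pb]|/?[ou]l|/?li|span style=""text-decoration: underline;"" data-mce-style=""text-decoration: underline;""|/span)>(?! ).*$",
        RegexOptions.Multiline);

    // Hypothetical helper: returns each line that contains a listed tag
    // which is not immediately followed by a space.
    public static List<string> FindLinesWithUnspacedTag(string text)
    {
        var lines = new List<string>();
        foreach (Match m in UnspacedTag.Matches(text))
        {
            lines.Add(m.Value);
        }
        return lines;
    }

    public static void Main()
    {
        string input =
            "This </b> is <b> OK\n" +
            "This <b> is </b>not OK\n" +
            "Neither <b>is </b> this.";

        // Prints:
        // This <b> is </b>not OK
        // Neither <b>is </b> this.
        foreach (string line in FindLinesWithUnspacedTag(input))
        {
            Console.WriteLine(line);
        }
    }
}
```

Because the lookahead only rules out a space, a tag at the very end of a line also counts as "not followed by a space", and such a line is reported as well.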
Explanation:
^ # Start of line
.* # Match any number of characters except newlines
< # Match a <
(?: # Either match a...
/?b # b or /b
| # or
/?em # em or /em
|... # etc. etc.
) # End of alternation
> # Match a >
(?! ) # Assert that no space follows
.* # Match any number of characters until...
$ # End of line
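The commented breakdown can also be kept inside the C# source by compiling the pattern with `RegexOptions.IgnorePatternWhitespace`. The sketch below assumes a shortened tag list (only `b` and `em`) to stay brief, and `VerbosePatternDemo` is a hypothetical name. One caveat: the unescaped space in `(?! )` would be ignored in that mode, leaving an empty lookahead that can never succeed, so the space is written as `[ ]` instead.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbosePatternDemo
{
    // Shortened tag list for illustration (an assumption, not the full list from the answer).
    public static readonly Regex UnspacedTag = new Regex(@"
        ^            # Start of line
        .*           # Any number of characters except newlines
        <            # A literal opening angle bracket
        (?:          # One of the tags:
            /?b      #   b or /b
          | /?em     #   em or /em
        )            # End of alternation
        >            # A literal closing angle bracket
        (?![ ])      # Assert that no space follows (character class keeps the space literal)
        .*           # Rest of the line
        $            # End of line
        ", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        Console.WriteLine(UnspacedTag.IsMatch("This </b> is <b> OK"));     // False
        Console.WriteLine(UnspacedTag.IsMatch("This <b> is </b>not OK"));  // True
    }
}
```

The same care applies to the spaces inside the long `span style=...` alternative if the full pattern is converted to this mode: each would need to be written as `[ ]` or escaped.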