I want to match a piece of HTML up to the next occurrence of ... or the end of the input.

Currently I have the following regular expression:

(<font color=\"#777777\">\.\.\. .+?<\/font>)

This matches:

1. <font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font>
2. <font color="#777777">... You may not want them to be </font>
3. <font color="#777777">... </font><font color="#000000">the web link, and </font>

But I want:

1. <font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise </font><font color="#000000">in case they have clicked about the link in your email.</font>
2. <font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font>
3. <font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding </font><font color="#000000">the page to </font><font color="#777777">get other than </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font>

This is the HTML I want to parse:

<font color="#777777">... </font><font color="#000000">lives up to the customer's expectations. The subscriber is </font><font color="#777777">obviously thinking about your merchandise  </font><font color="#000000">in case they have clicked about the link in your email.</font><font color="#777777">... You may not want them to be </font><font color="#000000">disappointed by simply clicking </font><font color="#777777">... </font><font color="#000000">the web link, and </font><font color="#777777">finding  </font><font color="#000000">the page to </font><font color="#777777">get other than  </font><font color="#000000">what they thought it </font><font color="#777777">will be.. If America makes</font>

And a demo: http://rubular.com/r/mmQ4TBZb96

How can I match all the text from one ... up to the next ... so that I get the desired matches listed above?

Thanks for the help!

2 Answers

Your question seems inconsistent (I don't see why you would get the last match you want), but I think this is what you're after:

((<font color=\"#777777\">\.{3}) .+?(<\/font>(?=\s*\2)|$))

It uses a lookahead so that each match ends just before the next "..." sequence (or at the end of the input).

See it on Rubular.
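
For anyone who wants to try the pattern outside Rubular, here is a minimal Python sketch. Python is an assumption on my part (the question only mentions Rubular), and the names `html`, `pattern`, and `matches` are purely illustrative. The regex is the one above, without the backslashes before `"` and `/`, which Python does not need.

```python
import re

# The HTML from the question, split across lines only for readability.
html = (
    '<font color="#777777">... </font>'
    '<font color="#000000">lives up to the customer\'s expectations. The subscriber is </font>'
    '<font color="#777777">obviously thinking about your merchandise  </font>'
    '<font color="#000000">in case they have clicked about the link in your email.</font>'
    '<font color="#777777">... You may not want them to be </font>'
    '<font color="#000000">disappointed by simply clicking </font>'
    '<font color="#777777">... </font>'
    '<font color="#000000">the web link, and </font>'
    '<font color="#777777">finding  </font>'
    '<font color="#000000">the page to </font>'
    '<font color="#777777">get other than  </font>'
    '<font color="#000000">what they thought it </font>'
    '<font color="#777777">will be.. If America makes</font>'
)

# Group 2 captures the grey "..." opener. The lazy .+? then runs until a
# </font> that is directly followed by the same opener (lookahead with \2),
# or until the end of the input.
pattern = re.compile(r'((<font color="#777777">\.{3}) .+?(</font>(?=\s*\2)|$))')

matches = [m.group(1) for m in pattern.finditer(html)]

for number, text in enumerate(matches, start=1):
    print(number, text)
```

Because the lookahead consumes nothing, the next match can start right at the following "..." opener, so the matches tile the whole input.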

answered 2013-07-04T12:19:06.993

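The question is about regular expressions, but you could also do it this way (Perl syntax, but I'm sure this kind of function exists in other languages too):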

split(/(?=<font color=\"#777777\">\.\.\.)/, $your_text)
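
As one example of the same idea in another language, here is a rough Python equivalent. It assumes Python 3.7 or later, where `re.split` can split on zero-width matches. The shortened `html` sample and the name `parts` are made up for illustration and are not the asker's data.

```python
import re

# Shortened, hypothetical sample with the same structure as the question's HTML.
html = (
    '<font color="#777777">... </font><font color="#000000">first </font>'
    '<font color="#777777">... second </font><font color="#000000">more </font>'
    '<font color="#777777">... </font><font color="#000000">third</font>'
)

# Split at every position where a grey "..." opener begins. The lookahead
# matches an empty string, so no text is removed by the split.
parts = re.split(r'(?=<font color="#777777">\.\.\.)', html)

# The empty match at position 0 yields a leading empty string; drop it.
parts = [part for part in parts if part]

for number, text in enumerate(parts, start=1):
    print(number, text)
```

Unlike the regex-match approach, this needs no special case for the end of the input: whatever follows the last opener simply becomes the last piece.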
answered 2013-07-04T13:40:26.517