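I have read the post on why you should not use regular expressions on HTML. However, as part of a task assigned to me, I have no choice but to use regex on HTML.
I have the following HTML snippets and tried each of them separately: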
<td class="a-nowrap">
<span class="a-letter-space"></span><span>13</span>
</td>
I was able to extract 13 with the following regex:
<td class="a-nowrap">\s*<span class="a-letter-space"></span><span>(\d*)</span>\s*</td>
Similarly, from
<td class="a-nowrap">
<a class="a-link-normal" title="69% of reviews have 5 stars" href="">5 star</a><span class="a-letter-space"></span>
</td>
I get 5 star with the regex
<td class="a-nowrap">\s*<a class="a-link-normal" [^>]*>\s*(.*)</a>\s*</td>
But when the two HTML snippets are combined, as in
<table id="histogramTable" class="a-normal a-align-middle a-spacing-base">
<tr class="a-histogram-row">
<td class="a-nowrap">
<a class="a-link-normal" title="69% of reviews have 5 stars" href="">5 star</a><span class="a-letter-space"></span>
</td>
<td class="a-span10">
<a class="a-link-normal" title="69% of reviews have 5 stars" href=""><div class="a-meter"><div class="a-meter-bar" style="width: 69.1358024691358%;"></div></div></a>
</td>
<td class="a-nowrap">
<span class="a-letter-space"></span><span>13</span>
</td>
</tr>
<td class="a-nowrap">
<a class="a-link-normal" title="2% of reviews have 1 stars" href="">1 star</a><span class="a-letter-space"></span>
</td>
<td class="a-span10">
<a class="a-link-normal" title="2% of reviews have 1 stars" href=""><div class="a-meter"><div class="a-meter-bar" style="width: 2.46913580246914%;"></div></div></a>
</td>
<td class="a-nowrap">
<span class="a-letter-space"></span><span>2</span>
</td>
</table>
how can I extract 5 star and 13 with a regex?
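For illustration, here is a minimal Python sketch of one way the two patterns above might be joined into a single pattern that captures the label and the count of each row together. The names `html`, `pattern` and `rows` are made up for this example, the sample is a shortened copy of the combined table, and the lazy `.*?` with `re.DOTALL` is an assumption about how to skip the `a-span10` cell in between, not the only way to do it.

```python
import re

# Shortened copy of the combined table from the question (assumed input).
html = '''
<table id="histogramTable" class="a-normal a-align-middle a-spacing-base">
<tr class="a-histogram-row">
<td class="a-nowrap">
<a class="a-link-normal" title="69% of reviews have 5 stars" href="">5 star</a><span class="a-letter-space"></span>
</td>
<td class="a-span10">
<a class="a-link-normal" title="69% of reviews have 5 stars" href=""><div class="a-meter"></div></a>
</td>
<td class="a-nowrap">
<span class="a-letter-space"></span><span>13</span>
</td>
</tr>
<td class="a-nowrap">
<a class="a-link-normal" title="2% of reviews have 1 stars" href="">1 star</a><span class="a-letter-space"></span>
</td>
<td class="a-span10">
<a class="a-link-normal" title="2% of reviews have 1 stars" href=""><div class="a-meter"></div></a>
</td>
<td class="a-nowrap">
<span class="a-letter-space"></span><span>2</span>
</td>
</table>
'''

pattern = re.compile(
    # Label cell: capture the link text, e.g. "5 star".
    r'<td class="a-nowrap">\s*'
    r'<a class="a-link-normal"[^>]*>\s*([^<]*?)\s*</a>'
    # Lazily skip everything up to the next count cell (DOTALL lets "." cross newlines).
    r'.*?'
    # Count cell: capture the digits, e.g. "13".
    r'<td class="a-nowrap">\s*'
    r'<span class="a-letter-space"></span><span>(\d+)</span>\s*</td>',
    re.DOTALL,
)

rows = pattern.findall(html)
print(rows)  # [('5 star', '13'), ('1 star', '2')]
```

The first tuple holds 5 star and 13. Note that the lazy skip pairs each label with the nearest following count, so a row without a count cell would silently borrow the count from the next row. The sketch also uses `\d+` rather than `\d*`, so an empty `<span></span>` is not accepted as a count.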