I'm scraping data from a web page into a text file, and I want to strip out some irrelevant content, for example:
</h3>
<div class="form clearfix">
<a href="/matches/2012/11/11/mexico/primera-division/club-san-luis/deportivo-toluca-futbol-club/1292713/" class="form-icon form-loss " title="San Luis - Toluca 0 - 2">L</a>
<a href="/matches/2012/11/04/mexico/primera-division/club-tijuana-xoloitzcuintles-de-caliente/club-san-luis/1292699/" class="form-icon form-draw " title="Tijuana - San Luis 0 - 0">D</a>
<a href="/matches/2012/10/28/mexico/primera-division/club-san-luis/queretaro-fc/1292695/" class="form-icon form-draw " title="San Luis - Querétaro 0 - 0">D</a>
<a href="/matches/2012/10/21/mexico/primera-division/club-atlas-de-guadalajara/club-san-luis/1292684/" class="form-icon form-win " title="Atlas - San Luis 2 - 3">W</a>
<a href="/matches/2012/10/14/mexico/primera-division/club-san-luis/club-atlante/1292674/" class="form-icon form-draw last" title="San Luis - Atlante 2 - 2">D</a>
</div>
</div>
<div class="container middle">
<h3 class="thick scoretime ">
I'm trying to reduce this to </h3><h3 class="thick scoretime ">
with all the other data in between removed.
This is what I tried:
source = Regex.Replace(source, "</h3>.*<h3 class=\"thick scoretime \">", "</h3><h3 class=\"thick scoretime \">")
But it didn't work. Can anyone point me in the right direction?
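
One likely cause, offered as a guess rather than a confirmed diagnosis: in most regex engines `.` does not match newline characters by default, so `.*` cannot span the several lines between `</h3>` and the next `<h3 class="thick scoretime ">`. Below is a minimal Python sketch of the same replacement with "dot matches newline" turned on and a non-greedy `.*?`. The helper name `strip_between` and the `sample` string are made up for illustration. If the original call is .NET's `Regex.Replace`, the corresponding switch would be `RegexOptions.Singleline`.

```python
import re

# The opening tag we want to keep, copied from the HTML in the question.
OPEN_TAG = '<h3 class="thick scoretime ">'


def strip_between(source: str) -> str:
    """Drop everything between a closing </h3> and the next scoretime <h3>.

    Hypothetical helper for illustration only.
    re.DOTALL lets '.' match newlines; '.*?' is non-greedy, so each match
    stops at the nearest opening tag instead of the last one in the page.
    """
    pattern = re.compile(r'</h3>.*?' + re.escape(OPEN_TAG), re.DOTALL)
    return pattern.sub('</h3>' + OPEN_TAG, source)


# Made-up sample shaped like the snippet in the question.
sample = (
    '<h3>Form</h3>\n'
    '<div class="form clearfix">\n'
    '<a href="/matches/example/" class="form-icon form-loss ">L</a>\n'
    '</div>\n'
    '<div class="container middle">\n'
    '<h3 class="thick scoretime ">\n'
    '2 - 0</h3>'
)

cleaned = strip_between(sample)
print(cleaned)
```

Without `re.DOTALL`, the same pattern finds no match in `sample` and leaves the text unchanged, which would be consistent with the behaviour described above. For anything beyond a one-off cleanup, an HTML parser is usually more robust than a regex.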