
I am trying to scrape fund prices from the following page:

http://www.prudential.com.hk/PruServlet?module=fund&purpose=searchHistFund&fundCd=JAS_U

However, the rows of the table use different class attributes; the cells alternate between class "fundPriceCell1" and class "fundPriceCell2":

<tr>
<td align="center" class="fundPriceCell1">08/11/2013</td><td align="center" class="fundPriceCell1">118.2500</td><td align="center" class="fundPriceCell1">118.2500</td>
</tr>
<tr>
<td align="center" class="fundPriceCell2">07/11/2013</td><td align="center" class="fundPriceCell2">118.9800</td><td align="center" class="fundPriceCell2">118.9800</td>
</tr>

How can I scrape the table? The code below does not work; how can I fix it?

import pandas as pd
import requests
url = 'http://www.prudential.com.hk/PruServlet?module=fund&purpose=searchHistFund&fundCd=JAS_U'
tables = pd.read_html(requests.get(url).text, attrs={"class":"fundPriceCell1"})

1 Answer


I think you can pass a compiled regular expression, and this pattern will match both class attributes:

import re
# Raw string so that \d reaches the regex engine unchanged
tables = pd.read_html(requests.get(url).text, attrs={"class": re.compile(r"fundPriceCell\d+")})
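
One caveat worth checking: as far as the pandas documentation describes it, the attrs argument of read_html is used to identify the <table> element itself, whereas in the HTML from the question the fundPriceCell classes sit on the <td> cells. If the regex does not end up selecting the table, a possible fallback is to match the cells directly. Below is a minimal sketch of that idea, using only the standard library and the two sample rows from the question. The wrapping <table> tag, the PriceRowParser class, and the variable names are assumptions made for illustration; for the real page, the HTML would come from requests.get(url).text instead of the sample string.

```python
import re
from html.parser import HTMLParser

# Two sample rows copied from the question; the enclosing <table> is assumed.
SAMPLE_HTML = """
<table>
<tr>
<td align="center" class="fundPriceCell1">08/11/2013</td><td align="center" class="fundPriceCell1">118.2500</td><td align="center" class="fundPriceCell1">118.2500</td>
</tr>
<tr>
<td align="center" class="fundPriceCell2">07/11/2013</td><td align="center" class="fundPriceCell2">118.9800</td><td align="center" class="fundPriceCell2">118.9800</td>
</tr>
</table>
"""

# Same pattern as in the answer: matches fundPriceCell1, fundPriceCell2, ...
CELL_CLASS = re.compile(r"fundPriceCell\d+")


class PriceRowParser(HTMLParser):
    """Hypothetical helper: collect text of matching <td> cells, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a matching <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            cls = dict(attrs).get("class") or ""
            if CELL_CLASS.fullmatch(cls):
                self._in_cell = True
                self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without any matching cell
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


parser = PriceRowParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
```

If a DataFrame is needed, the resulting list of rows can be passed to pd.DataFrame(rows); the question does not name the columns, so they are left unlabeled here.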
answered 2013-11-13T12:00:41.320