python - Python web scraper outputs the number wrapped in brackets

Question

I'm running Python 3.3 on Windows. The code below goes to Yahoo Finance, pulls the stock price, and prints it. The problem is that it outputs:

['540.04']

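I want just the number, so I can turn it into a float and use it in a formula. I tried simply calling float on it, but that doesn't work. I think I need a few lines of code to strip the brackets and apostrophes somehow.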

    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    import re

    htmlfile = urlopen("http://finance.yahoo.com/q?s=AAPL&q1=1")

    Thefind = re.compile ('<span id="yfs_l84_aapl">(.+?)</span>')

    msg=htmlfile.read()

    price = Thefind.findall(str(msg))

    print (price)

score 0 · Accepted Answer

The beauty of BeautifulSoup is that you don't have to use regular expressions to parse HTML data.

Here is the right way to do it with BS:

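    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("http://finance.yahoo.com/q?s=AAPL&q1=1")
    # Name the parser explicitly so the result does not depend on what is installed
    soup = BeautifulSoup(html, "html.parser")
    # Look up the <span> that holds the price by its id attribute
    my_span = soup.find('span', {'id': 'yfs_l84_aapl'})
    print(my_span.text)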

which yields

540.04
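The value printed above is still a string, while the question asks for a float. A minimal sketch of the conversion step follows; the helper name `to_price` is hypothetical, and the handling of thousands separators is an assumption about how larger prices might be formatted, not something shown on the page.

```python
def to_price(text):
    """Convert scraped price text such as ' 1,540.04 ' to a float."""
    # Drop surrounding whitespace and any thousands separators (assumed to be commas)
    cleaned = text.strip().replace(",", "")
    return float(cleaned)


price = to_price("540.04")
print(price)
```

With the answer's code, this would be called as `to_price(my_span.text)`.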

score 0 · Accepted Answer

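The findall() function returns a list. If you only want the first group, select it like this:

    Thefind.findall(str(msg))[0]

But it is cleaner to refer to any group like this (using search, because match only matches at the very start of the string):

    Thefind.search(str(msg)).group(1)

Note: group(0) is the entire match, not the first group.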
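The difference between these calls can be tried without a network request. The sketch below uses a small stand-in string (`sample_html`, an assumption, not the real Yahoo page) with the same span id as in the question.

```python
import re

# Stand-in for the downloaded page text (hypothetical, not the real page)
sample_html = '<html><body><span id="yfs_l84_aapl">540.04</span></body></html>'

pattern = re.compile(r'<span id="yfs_l84_aapl">(.+?)</span>')

# findall() returns a list of the captured groups, hence the brackets when printed
matches = pattern.findall(sample_html)
first = matches[0]

# search() scans the whole string; match() would only try at position 0
m = pattern.search(sample_html)
via_group = m.group(1) if m else None

# Either string can now be converted to a float
price = float(first)
print(matches, first, via_group, price)
```

Checking `m` before calling `group(1)` avoids an AttributeError when the span is not present in the page.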

score -1 · Accepted Answer


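Use Python's built-in float function on the string form of the list (strip is a string method, so the list has to go through str() first): float(str(price).strip("[']"))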

answered 2014-01-08T03:03:54.120
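For comparison, here is a small sketch of the strip approach next to plain indexing. The list literal stands in for what findall() returned in the question; the variable names `direct` and `stripped` are only for illustration.

```python
price = ['540.04']  # what findall() returned in the question

# Indexing takes the string out of the list directly
direct = float(price[0])

# strip() is a string method, so the list must be turned into text first;
# the characters [ ' ] are then removed from both ends
stripped = float(str(price).strip("[']"))

print(direct, stripped)
```

Both give the same float here, but indexing does not depend on how Python happens to print a list.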
