python - BS4: getting info from a class with a weird name

Question

<span class=\"normal_price\">$2.69 USD<\/span>

How can I extract this data with bs4? This doesn't work:

soup.find("span", attrs={"class": "\"normal_price\""})

score 1 · Accepted Answer

You have HTML embedded in a JSON string, and JSON strings must escape quotes. Rather than extract that data by hand, parse the JSON first:

import json

data = json.loads(json_data)
html = data['results_html']
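
To show why decoding comes first, here is a minimal sketch with a made-up payload. The `json_data` string below is an assumption modeled on the snippet in the question, not the actual server response. Once `json.loads` has run, the `\"` and `\/` escapes are gone and the class attribute is plain `normal_price`:

```python
import json

# Hypothetical raw response body: HTML embedded in a JSON string,
# so quotes and slashes arrive backslash-escaped.
json_data = r'{"success": true, "results_html": "<span class=\"normal_price\">$2.69 USD<\/span>"}'

data = json.loads(json_data)   # decoding removes the JSON escapes
html = data['results_html']

print(html)
```

The decoded `html` can then be searched with the ordinary class name, with no escaped quotes in the selector.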

If you are using the requests library, it can decode the response for you:

import requests

response = requests.get('http://steamcommunity.com/market/search/render/?query=appid:730&start=0&count=3&currency=3&l=english&cc=pt')
html = response.json()['results_html']

After that you can parse it with BeautifulSoup just fine:

>>> import requests
>>> from bs4 import BeautifulSoup
>>> html = requests.get('http://steamcommunity.com/market/search/render/?query=appid:730&start=0&count=3&currency=3&l=english&cc=pt').json()['results_html']
>>> BeautifulSoup(html, 'lxml').find('span', class_='normal_price').span
<span class="normal_price">$2.69 USD</span>
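
The same lookup can be tried offline. This is only a sketch: the fragment below is a hypothetical stand-in for the decoded `results_html`, assuming an outer span that also carries the `normal_price` class and wraps the inner price span (which is what the `.span` step in the session above suggests). It uses the built-in `html.parser` so that lxml is not required:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment, loosely modeled on the decoded results_html:
# an outer span wrapping the inner price span.
html = (
    '<span class="market_table_value normal_price">'
    'Starting at:<br/>'
    '<span class="normal_price">$2.69 USD</span>'
    '</span>'
)

soup = BeautifulSoup(html, 'html.parser')
outer = soup.find('span', class_='normal_price')  # first match is the outer span
inner = outer.span                                # first <span> nested inside it
price = inner.get_text(strip=True)                # text only, without the tag

print(price)
```

If the real markup has no outer wrapper, `outer.span` would be `None`, so checking for that before calling `get_text` is a sensible precaution.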
