list - Adapting a Craigslist Scraper in Python

Question

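I'm trying to adapt a Python 2.7 Craigslist scraper that I found online so that it works with Python 3.6.

But every time I run the script, it doesn't return anything. Is it because I'm not targeting the correct HTML tags? If so, how would I target the correct ones?

I'm assuming it's this part of the code: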

    for listing in soup.find_all('p',{'class':'result-row'}):
        if listing.find('span',{'class':'result-price'}) != None:

The full script is below.

Thank you in advance.

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

URL = 'https://vancouver.craigslist.ca/search/sss?query=Vespa'
BASE = 'https://vancouver.craigslist.ca/'

response = requests.get(URL)

soup = BeautifulSoup(response.content,"html.parser")
for listing in soup.find_all('p',{'class':'result-row'}):
    if listing.find('span',{'class':'result-price'}) != None:
        price = listing.text[2:6]
        price = int(price)
        if price <=3600 and price > 1000:
            print (listing.text)
            link_end = listing.a['href']
            url = urljoin(BASE, link_end)
            print (url)
            print ("\n")
print('test')

score 0 · Accepted Answer

You're right, this is likely the problem:

    for listing in soup.find_all('p',{'class':'result-row'}):
        if listing.find('span',{'class':'result-price'}) != None:

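This snippet has to be edited for the specific web page you are scraping. Have you looked at the page's HTML and verified that these two lines match its tags and classes? If not, right-click the page and select "View Page Source". Then you have to find the specific data you want to scrape.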
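If reading the raw source by eye is tedious, one option is to count which tag and class combinations actually occur in the HTML. The following is only a rough sketch: it uses the standard library's `html.parser` rather than BeautifulSoup, `TagClassCounter` and `count_tag_classes` are names made up for this illustration, and `sample_html` is hypothetical markup standing in for `response.text`, not the real Craigslist page.

```python
from collections import Counter
from html.parser import HTMLParser


class TagClassCounter(HTMLParser):
    """Count every (tag, class) pair seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        class_value = dict(attrs).get("class") or ""
        for class_name in class_value.split():
            self.counts[(tag, class_name)] += 1


def count_tag_classes(html):
    """Return a Counter mapping (tag, class) to number of occurrences."""
    parser = TagClassCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical markup; in the real script you would pass response.text.
sample_html = """
<ul>
  <li class="result-row"><span class="result-price">$2500</span></li>
  <li class="result-row"><span class="result-price">$3100</span></li>
</ul>
"""

counts = count_tag_classes(sample_html)
for (tag, class_name), n in counts.most_common():
    print(tag, class_name, n)
```

In this made-up sample the class `result-row` sits on `li` elements, so a search for `p` elements with that class would find nothing. A mismatch of that kind would make the loop body never run.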

If I wanted to scrape something from a web page whose HTML looked like this:

<div class='what'>hello</div>

I would change the code above to:

for listing in soup.find_all('div',{'class':'what'}):
     # do something
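As a self-contained check of the idea, here is a minimal sketch that runs the adjusted selector against the one-line example. The second `div` and the variable names `html`, `matches` and `wrong` are added only for illustration.

```python
from bs4 import BeautifulSoup

# The example element from above, plus one hypothetical non-matching div.
html = "<div class='what'>hello</div><div class='other'>skip</div>"
soup = BeautifulSoup(html, "html.parser")

# The selector matches the markup, so the loop has something to iterate over.
matches = [tag.get_text() for tag in soup.find_all('div', {'class': 'what'})]
print(matches)

# A selector with the wrong tag name returns an empty result, so a for loop
# over it runs zero times and the script prints nothing from its body.
wrong = soup.find_all('p', {'class': 'what'})
print(len(wrong))
```

Printing `len(soup.find_all(...))` before the loop in the original script is a quick way to tell whether the selector or the price filter is what produces no output.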
