I'm trying to write a web scraper for wordreference.com. This is my code:
import sys

import bs4
import requests


def scraper():
    # Build the lookup URL from the first command-line argument.
    url = f"https://wordreference.com/iten/{sys.argv[1]}"
    r = requests.get(url)
    html = r.text
    soup = bs4.BeautifulSoup(html, "html.parser")
    # Collect every <td> element with class "ToWrd".
    x = soup.find_all("td", "ToWrd")
    print(x[1].contents)


if __name__ == "__main__":
    scraper()
The code runs without errors, but the output is not what I want. Input:
python3 scraper.py cane
Output:
['dog ', <em class="tooltip POS2">n<span><i>noun</i>: Refers to person, place, thing, quality, etc.</span></em>]
I only want to print the first element of that list, the one that contains "dog". How can I do that?
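
For reference, here is a minimal sketch of one possible way to pull out just the word. It runs against a hard-coded HTML fragment rather than the live site, so SAMPLE_HTML is only an assumption modeled on the output shown above, and first_text is a hypothetical helper name, not part of bs4. The idea is that .contents is an ordinary list whose text children are NavigableString objects, so the first such child can be picked out and stripped of the trailing space.

```python
import bs4

# Assumed fragment imitating the cell from the output above;
# the real page markup may differ.
SAMPLE_HTML = (
    '<table><tr>'
    '<td class="ToWrd">English</td>'
    '<td class="ToWrd">dog <em class="tooltip POS2">n<span><i>noun</i>: '
    'Refers to person, place, thing, quality, etc.</span></em></td>'
    '</tr></table>'
)


def first_text(cell):
    """Return the first plain-text child of a tag, stripped, or None."""
    for child in cell.contents:
        # Text nodes are NavigableString instances; nested tags are skipped.
        if isinstance(child, bs4.NavigableString):
            return str(child).strip()
    return None


soup = bs4.BeautifulSoup(SAMPLE_HTML, "html.parser")
cells = soup.find_all("td", "ToWrd")
word = first_text(cells[1])
print(word)
```

In the original script the equivalent change would be to replace print(x[1].contents) with print(x[1].contents[0].strip()), assuming the first child of that cell is always the text node, as it is in the output above.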