python - Python + Mechanize Async Tasks

Question

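I have a bit of Python code that runs through a Delicious search page and scrapes some links off of it. The extract method contains some magic that pulls out the required content. However, fetching the pages one after another is pretty slow. Is there a way to do this asynchronously in Python, so that I can launch several GET requests and process the pages in parallel?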

import mechanize
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

br = mechanize.Browser()
url = "http://www.delicious.com/search?p=varun"
page = br.open(url)
html = page.read()
soup = BeautifulSoup(html, "html.parser")
extract(soup)  # extract() is my own function, defined elsewhere

count = 1
# Follow the "Next" link onto consecutive pages
while soup.find('a', attrs={'class': 'pn next'}):
    print("yay")
    print(count)
    try:
        page = br.follow_link(text_regex="Next")
        # Re-parse into soup so the loop condition checks the current page
        soup = BeautifulSoup(page.read(), "html.parser")
        extract(soup)
    except Exception:
        print("End of Pages")
        break
    count += 1

score 1 · Accepted Answer

Beautiful Soup is pretty slow. If you want better performance, use lxml instead, or, if you have many CPUs, you could try multiprocessing with queues.
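
To make the queue idea concrete, here is a minimal sketch of a worker pool fed from a queue. It uses threads rather than processes, on the assumption that the fetch step is mostly waiting on the network. The same shape should carry over to `multiprocessing.Process` with `multiprocessing.Queue` if parsing turns out to be the CPU-bound part. The names `run_workers`, `fake_fetch`, and the `handle` callback are hypothetical and not from the question. `fake_fetch` stands in for a real download so the example runs without network access. It also assumes the page URLs are known up front instead of being discovered by following "Next" links one at a time.

```python
import queue
import threading


def run_workers(urls, fetch, handle, num_workers=4):
    """Fetch every URL with a pool of worker threads; return {url: outcome}."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:  # sentinel: no more work for this worker
                break
            try:
                outcome = handle(fetch(url))
            except Exception as exc:  # one bad page should not kill the worker
                outcome = exc
            with lock:
                results[url] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


def fake_fetch(url):
    # Hypothetical stand-in for the real download (e.g. a browser open + read)
    return "<html>%s</html>" % url


if __name__ == "__main__":
    pages = ["page-%d" % n for n in range(1, 6)]  # placeholder URLs
    sizes = run_workers(pages, fake_fetch, len, num_workers=3)
    for name in pages:
        print(name, sizes[name])
```

In real use, `fetch` would download the page and `handle` would parse it and call `extract`. This sketch does not assume a single mechanize browser object can be shared between threads, so giving each worker its own is probably the safer choice.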
