php - Using a web page's functionality from Python

Question

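I am trying to understand how this website works. It has an input form where you can provide a URL. The form returns information retrieved from another site (YouTube). So:

My first, and more interesting, question is whether anyone knows how the site retrieves the whole corpus of statements.

Alternatively, since at the moment I am using the following code: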

from BeautifulSoup import BeautifulSoup
import json
import urllib2

urlstr = 'http://www.sandracires.com/en/client/youtube/comments.php?v=' + videoId + '&page=' + str(npage)
url = urllib2.urlopen(urlstr)
content = url.read()
soup = BeautifulSoup(content)
#parse json
newDictionary=json.loads(str(soup)) 

#print example
print newDictionary['list'][1]['username']

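However, I cannot iterate over all the pages (this does not happen when I go through them manually). I put timer.sleep(30) below the json line, but without success. Why does this happen?

Thanks!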

Python 2.7.8

score 0 · Accepted Answer

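The site is probably using the Google YouTube Data API. Note that (at present) comments can only be retrieved with version 2 of the API, which has been deprecated. Apparently this is not yet supported in V3. A Python client library is available; see https://developers.google.com/youtube/code#Python.

The response is already JSON, so there is no need for BS (BeautifulSoup). The web server appears to require cookies, so I suggest using the requests module, in particular its session management: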

import requests

videoId = 'ZSzeFFsKEt4'
results = []
npage = 1
session = requests.session()
while True:
    urlstr = 'http://www.sandracires.com/en/client/youtube/comments.php'
    print "Getting page ", npage
    response = session.get(urlstr, params={'v': videoId, 'page': npage})
    content = response.json()
    if len(content['list']) > 1:
        results.append(content)
    else:
        break
    npage += 1

print results
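
The same loop can also be split into a fetch step and a paging step, which makes the stop condition easier to test without hitting the server. The following is only a sketch under some assumptions: it is written for Python 3 rather than 2.7, the helper names fetch_page and collect_comments are hypothetical, and it assumes the endpoint returns an empty or missing 'list' once the pages run out (the loop above stops at one item or fewer instead). The session argument is expected to be a requests.Session, or any object with a compatible get().

```python
BASE_URL = 'http://www.sandracires.com/en/client/youtube/comments.php'


def fetch_page(session, video_id, page, timeout=10):
    """Fetch one page of comments and return the decoded JSON payload.

    `session` is expected to be a requests.Session so that cookies set by
    the server on the first request are sent again on later requests.
    """
    response = session.get(
        BASE_URL, params={'v': video_id, 'page': page}, timeout=timeout
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()       # the body is already JSON, no HTML parsing


def collect_comments(fetch, max_pages=50):
    """Call fetch(page) for page 1, 2, ... and gather every comment.

    Stops at the first page whose 'list' is missing or empty (an assumption
    about the endpoint), or after max_pages as a safety limit.
    """
    comments = []
    for page in range(1, max_pages + 1):
        items = fetch(page).get('list') or []
        if not items:
            break
        comments.extend(items)
    return comments


# Usage (needs the requests package and network access):
#   import requests
#   with requests.Session() as session:
#       comments = collect_comments(
#           lambda page: fetch_page(session, 'ZSzeFFsKEt4', page))
#   print(len(comments))
```

Passing the fetch step in as a function keeps the paging logic independent of the network, and max_pages guards against looping forever if the server never returns an empty page.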
