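I am writing a Python script that scans a YouTube video for the usernames of the people who commented on it and writes those usernames to a file.
I am using the YouTube API, and when I print the entire response for a comment_entry I can see the comment author.
Is there a way to isolate the username?
For example, entering 9bZkp7q19f0 (Gangnam Style) as the video_id produces (in the set of values for the first comment):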
<?xml version='1.0' encoding='UTF-8'?>
<ns0:entry xmlns:ns0="http://www.w3.org/2005/Atom" xmlns:ns1="http://gdata.youtube.com/schemas/2007"><ns0:category scheme="http://schemas.google.com/g/2005#kind" term="http://gdata.youtube.com/schemas/2007#comment" /><ns0:id>http://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh9VQaVtT18UUKqLpyWBdytJ7B-JRTu0cf8</ns0:id><ns0:author><ns0:name>THAsweatyGamer</ns0:name><ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri></ns0:author><ns0:content type="text">sometimes but not always</ns0:content><ns0:updated>2013-05-17T12:30:27.000Z</ns0:updated><ns0:published>2013-05-17T12:30:27.000Z</ns0:published><ns0:title type="text">sometimes but not ...</ns0:title><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0?client=TJNP_YT_BOT" rel="related" type="application/atom+xml" /><ns0:link href="https://www.youtube.com/watch?v=9bZkp7q19f0" rel="alternate" type="text/html" /><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh-b_np-G6TRfbDU8xlXaRcR_qXeRfla_vo?client=TJNP_YT_BOT" rel="http://gdata.youtube.com/schemas/2007#in-reply-to" type="application/atom+xml" /><ns0:link href="https://gdata.youtube.com/feeds/api/videos/9bZkp7q19f0/comments/LZQPQhLyRh9VQaVtT18UUKqLpyWBdytJ7B-JRTu0cf8?client=TJNP_YT_BOT" rel="self" type="application/atom+xml" /><ns1:videoid>9bZkp7q19f0</ns1:videoid></ns0:entry>
I want to isolate <ns0:author><ns0:name>THAsweatyGamer</ns0:name><ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri></ns0:author>
so that I can write the username to a file. Using comment_entry.author yields:
[<atom.Author object at 0x02CE5B50>]
[<atom.Author object at 0x02CE5EB0>]
[<atom.Author object at 0x02CED230>]
[<atom.Author object at 0x02CED5B0>]
[<atom.Author object at 0x02CED910>]
[<atom.Author object at 0x02CEDCD0>]
[<atom.Author object at 0x02CF6070>]
[<atom.Author object at 0x02CF63D0>]
[<atom.Author object at 0x02CF6750>]
[<atom.Author object at 0x02CF6B10>]
[<atom.Author object at 0x02CF6E90>]
[<atom.Author object at 0x03591210>]
[<atom.Author object at 0x03591590>]
[<atom.Author object at 0x03591950>]
[<atom.Author object at 0x03591CD0>]
[<atom.Author object at 0x0359B050>]
[<atom.Author object at 0x0359B3D0>]
[<atom.Author object at 0x0359B750>]
[<atom.Author object at 0x0359BAD0>]
[<atom.Author object at 0x0359BE50>]
[<atom.Author object at 0x035A31D0>]
[<atom.Author object at 0x035A3530>]
[<atom.Author object at 0x035A3890>]
[<atom.Author object at 0x035A3BF0>]
My script (so far) is:
import gdata.youtube
import gdata.youtube.service
yt_service = gdata.youtube.service.YouTubeService()
yt_service.ssl = True
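yt_service.developer_key = 'mykey'      # placeholder for my developer key
yt_service.client_id = 'myclientid'     # placeholder for my client ID
yt_service.source = 'myclientid'        # placeholder for my client ID
video_id = raw_input("Enter the video's ID: ")
comment_feed = yt_service.GetYouTubeVideoCommentFeed(video_id=video_id)
for comment_entry in comment_feed.entry:
    # prints a list holding one atom.Author object, not the username itself
    print comment_entry.author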
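To make clearer what I am after, here is a minimal sketch that pulls the name out of the raw XML instead of the gdata objects. It only uses the standard library and works on the XML string I get when printing comment_entry. The helper name author_names and the constant SAMPLE_ENTRY are my own, and SAMPLE_ENTRY is a trimmed copy of the response above, so treat this as an illustration of the desired result rather than the proper gdata way of doing it.

```python
import xml.etree.ElementTree as ET

# Namespace used by the <ns0:...> elements in the response above.
ATOM_NS = "http://www.w3.org/2005/Atom"

# Trimmed copy of the first comment entry shown above (illustration only).
SAMPLE_ENTRY = (
    '<ns0:entry xmlns:ns0="http://www.w3.org/2005/Atom">'
    '<ns0:author>'
    '<ns0:name>THAsweatyGamer</ns0:name>'
    '<ns0:uri>https://gdata.youtube.com/feeds/api/users/THAsweatyGamer</ns0:uri>'
    '</ns0:author>'
    '<ns0:content type="text">sometimes but not always</ns0:content>'
    '</ns0:entry>'
)


def author_names(entry_xml):
    """Return the text of every <author><name> element in one Atom entry.

    Hypothetical helper: it parses the XML string, it does not touch gdata.
    """
    root = ET.fromstring(entry_xml)
    # ElementTree addresses namespaced tags as "{namespace}tag".
    path = "{%s}author/{%s}name" % (ATOM_NS, ATOM_NS)
    return [name.text for name in root.findall(path)]


def write_usernames(names, path):
    """Write one username per line to the file at `path`."""
    with open(path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
```

Calling author_names(SAMPLE_ENTRY) gives a list containing only THAsweatyGamer, which is the value I want to pass to write_usernames. I assume the same value can be reached directly on the gdata objects, possibly through something like comment_entry.author[0].name.text, but I have not verified that.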