
I am trying to download tweets from Twitter.

For this I am using Python and Tweepy, although I am new to both Python and the Twitter API.

My Python script is as follows:

#!/usr/bin/python

#import modules
import sys
import tweepy
import json

#global variables
consumer_key = ''
consumer_secret = ''
token_key = ''
token_secret = ''

#Main function
def main():
    print sys.argv[0],'starts'
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(token_key, token_secret)
    print 'Connected to Twitter'
    api = tweepy.API(auth)
    if not api.test():
        print 'Twitter API test failed'

    print 'Experiment with cursor'
    print 'Get search method returns json objects'

    json_search = api.search(q="football")
    #json.loads(json_search())
    print  json_search


#Standard boilerplate to call main function if this file runs

if __name__ == '__main__':
    main()

The result I get is as follows:

[<tweepy.models.SearchResult object at 0x9a0934c>, <tweepy.models.SearchResult object at 0x9a0986c>, <tweepy.models.SearchResult object at 0x9a096ec>, <tweepy.models.SearchResult object at 0xb76d8ccc>, <tweepy.models.SearchResult object at 0x9a09ccc>, <tweepy.models.SearchResult object at 0x9a0974c>, <tweepy.models.SearchResult object at 0x9a0940c>, <tweepy.models.SearchResult object at 0x99fdfcc>, <tweepy.models.SearchResult object at 0x99fdfec>, <tweepy.models.SearchResult object at 0x9a08cec>, <tweepy.models.SearchResult object at 0x9a08f4c>, <tweepy.models.SearchResult object at 0x9a08eec>, <tweepy.models.SearchResult object at 0x9a08a4c>, <tweepy.models.SearchResult object at 0x9a08c0c>, <tweepy.models.SearchResult object at 0x9a08dcc>]

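Now I am confused about how to extract the tweets from this output. I tried calling json.loads on the data, but it raises an error because JSON expects a string or buffer. Example code would be highly appreciated. Thanks in advance.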


4 Answers


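Tweepy gives you richer objects; it has already parsed the JSON for you.

The SearchResult objects have the same attributes as the JSON structures Twitter sends; just consult the Tweet documentation to see what is available: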

# Each result is an already-parsed object; the tweet body is in .text
for result in api.search(q="football"):
    print(result.text)

Demo:

>>> import tweepy
>>> tweepy.__version__
'3.3.0'
>>> consumer_key = '<consumer_key>'
>>> consumer_secret = '<consumer_secret>'
>>> access_token = '<access_token>'
>>> access_token_secret = '<access_token_secret>'
>>> auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
>>> auth.set_access_token(access_token, access_token_secret)
>>> api = tweepy.API(auth)
>>> for result in api.search(q="football"):
...     print result.text
... 
Great moments from the Women's FA Cup http://t.co/Y4C0LFJed9
RT @freebets: 6 YEARS AGO TODAY: 

Football lost one of its great managers. 

RIP Sir Bobby Robson. http://t.co/NCo90ZIUPY
RT @Oddschanger: COMPETITION CLOSES TODAY!

Win a Premier League or Football League shirt of YOUR choice! 

RETWEET &amp; FOLLOW to enter. http…
Berita Transfer: Transfer rumours and paper review – Friday, July 31 http://t.co/qRrDIEP2zh [TS] #nobar #gosip
@ajperry18 im sorry I don't know this football shit
@risu_football おれモロ誕生日で北辰なんすよ笑
NFF Unveils Oliseh As Super Eagles Coach - SUNDAY Oliseh has been unveiled by the Nigeria Football... http://t.co/IOYajD9bi2 #Sports
RT @BilelGhazi: RT @lequipe : Gourcuff, au tour de Guingamp http://t.co/Dkio8v9LZq
@EDS_Amy HP SAUCE ?
RT @fsntweet: マンCの塩対応に怒りの炎!ベトナム人ファン、チケットを燃やして猛抗議 - http://t.co/yg5iuABy3K 

なめるなよ、プレミアリーグ!マンチェスターCのプレシーズンツアーの行き先でベトナム人男性が、衝撃的な行
RT @peterMwendo: Le football cest un sport collectif ou on doit se faire des passe http://t.co/61hy138yo8
RT @TSBible: 6 years ago today, football lost a true gentleman. Rest in Peace Sir Bobby Robson. http://t.co/6eHTI6UxaC
6 years ago today the greatest football manger of all time passed away SIR Bobby Robson a true Ipswich and footballing legend
The Guardian: PSG close to sealing £40m deal for Manchester United’s Ángel Di María. http://t.co/gAQEucRLZa
Sir Bobby Robson, the #football #legend passed away 6 years ago. 

#Barcelona #newcastle #Porto http://t.co/4UXpnvrHhS
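
To see why json.loads failed in the question and what attribute access looks like, here is a minimal offline sketch. FakeResult is a hypothetical stand-in for the objects Tweepy returns (it is not a Tweepy class), the sample texts are made up, and no network call is made.

```python
import json


class FakeResult(object):
    """Hypothetical stand-in for a parsed Tweepy result (not a Tweepy class)."""

    def __init__(self, text, lang):
        self.text = text
        self.lang = lang


# Pretend this list came back from a search call.
results = [FakeResult("Great match today", "en"), FakeResult("Quel but !", "fr")]

# json.loads expects JSON text, so handing it a list of objects fails.
try:
    json.loads(results)
    loads_failed = False
except TypeError:
    loads_failed = True

# The objects are already parsed: read the fields as attributes.
texts = [r.text for r in results]

# If JSON text is needed, build plain dicts first and serialize those.
as_json = json.dumps([{"text": r.text, "lang": r.lang} for r in results])

print(loads_failed)
print(texts)
```

The same pattern should carry over to real results: loop over them and read result.text (or other attributes) instead of passing the list to json.loads.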
answered 2013-02-13T15:23:03.360

You can use the JSON parser to achieve this. Here is my code on App Engine, which prepares a JSONP response for a jQuery client:

import webapp2
import tweepy
import json
from tweepy.parsers import JSONParser

class APISearchHandler(webapp2.RequestHandler):
    def get(self):

        CONSUMER_KEY = 'xxxx'
        CONSUMER_SECRET = 'xxxx'
        ACCESS_TOKEN_KEY = 'xxxx'
        ACCESS_TOKEN_SECRET = 'xxxx'

        auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
        auth.set_access_token(ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)
        api = tweepy.API(auth, parser=JSONParser())

        # Query String Parameters
        qs = self.request.get('q')
        max_id = self.request.get('max_id')

        # JSONP Callback
        callback = self.request.get('callback')

        max_tweets = 100
        search_results = api.search(q=qs, count=max_tweets, max_id=max_id)
        json_str = json.dumps( search_results )

        if callback:
            response = "%s(%s)" % (callback, json_str)
        else:
            response = json_str

        self.response.write( response )

So the key is:

api = tweepy.API(auth, parser=JSONParser())
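
The JSONP wrapping step in the handler can be tried on its own, without Tweepy or App Engine. This is only a sketch: to_jsonp and the sample payload are hypothetical names and data introduced for illustration, and the payload merely stands in for the plain dict that the handler passes to json.dumps.

```python
import json


def to_jsonp(payload, callback=None):
    """Serialize payload to JSON and wrap it in callback(...) if one is given."""
    json_str = json.dumps(payload)
    if callback:
        return "%s(%s)" % (callback, json_str)
    return json_str


# Made-up payload, shaped loosely like a search response.
sample = {"statuses": [{"text": "hello"}]}

print(to_jsonp(sample))
print(to_jsonp(sample, "handleTweets"))
```

Without a callback the function returns plain JSON; with one, the JSON is wrapped in a function call that a script tag on the client can execute.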
answered 2015-07-31T19:22:12.397

Instead of using global variables, I would reorganize the code into a Python class:

import tweepy

class TweetPrinter():
    """
        Simple class to print tweets
    """
    def __init__(self, consumer_key, consumer_secret, access_token, 
                 access_token_secret):
        self.consumer_key = consumer_key
        self.consumer_secret = consumer_secret
        self.access_token = access_token
        self.access_token_secret = access_token_secret
        self.auth = tweepy.OAuthHandler(self.consumer_key, 
                                        self.consumer_secret)
        self.auth.set_access_token(access_token, access_token_secret)

    def tweet_print(self):
        api = tweepy.API(self.auth)
        football_tweets = api.search(q="football")
        for tweet in football_tweets:
            print(tweet.text)


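def main():
    # Placeholder credentials: replace these with your own values.
    my_consumer_key = '<consumer_key>'
    my_consumer_secret = '<consumer_secret>'
    my_access_token = '<access_token>'
    my_access_token_secret = '<access_token_secret>'

    tweet_printer = TweetPrinter(my_consumer_key, my_consumer_secret,
                                 my_access_token, my_access_token_secret)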

    tweet_printer.tweet_print()

if __name__ == '__main__':
    main()
answered 2018-01-12T14:03:44.610

Here is my code for Tweepy:

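import tweepy

# consumer_key, consumer_secret, access_key and access_secret are assumed
# to be defined elsewhere with your own credentials.
def twitterfeed():
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    # Fetch the 20 most recent statuses from the home timeline.
    statuses = tweepy.Cursor(api.home_timeline).items(20)
    data = [s.text.encode('utf8') for s in statuses]
    print(data)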
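
The encode('utf8') call turns each tweet text into a UTF-8 byte string. On Python 3 those are bytes objects and print with a b'...' prefix, so there you may prefer to keep s.text as it is. A small offline sketch with made-up sample texts (they stand in for the s.text values and do not come from Twitter):

```python
# Made-up sample texts standing in for s.text values.
texts = [u"Caf\u00e9 football", u"plain ascii"]

# Encoding produces UTF-8 byte strings, one per text.
encoded = [t.encode("utf8") for t in texts]

# Decoding with the same codec restores the original text.
decoded = [b.decode("utf8") for b in encoded]

print(encoded)
print(decoded == texts)
```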
answered 2013-02-13T17:12:38.173