
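The code below streams the Twitter public timeline and prints a few fields of each matching tweet to the console. I want to save the same fields (status.text, status.author.screen_name, status.created_at, status.source) to a sqlite database. When my script sees a tweet, I get a syntax error and nothing is written to the sqlite database.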

Error:

$ python stream-v5.py @lunchboxhq
Filtering the public timeline for "@lunchboxhq"RT @LunchboxHQ: test 2   LunchboxHQ  2012-02-29 18:03:42 Echofon
Encountered Exception: near "?": syntax error

Code:

import sys
import tweepy
import webbrowser
import sqlite3 as lite

# Query terms

Q = sys.argv[1:]

sqlite3file='/var/www/twitter.lbox.com/html/stream5_log.sqlite'

CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_TOKEN = ''
ACCESS_TOKEN_SECRET = ''

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

con = lite.connect(sqlite3file)
cur = con.cursor()
cur.execute("CREATE TABLE TWEETS(txt text, author text, created int, source text)")

class CustomStreamListener(tweepy.StreamListener):

    def on_status(self, status):

        try:
            print "%s\t%s\t%s\t%s" % (status.text, 
                                      status.author.screen_name, 
                                      status.created_at, 
                                      status.source,)

            cur.executemany("INSERT INTO TWEETS(?, ?, ?)", (status.text, 
                                                            status.author.screen_name, 
                                                            status.created_at, 
                                                            status.source))

        except Exception, e:
            print >> sys.stderr, 'Encountered Exception:', e
            pass

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

streaming_api = tweepy.streaming.Stream(auth, CustomStreamListener(), timeout=60)

print >> sys.stderr, 'Filtering the public timeline for "%s"' % (' '.join(sys.argv[1:]),)

streaming_api.filter(follow=None, track=Q)

4 Answers

You are missing a closing parenthesis on the last line of the following code (lines 34-37 of what you posted):

            cur.executemany("INSERT INTO TWEETS(?, ?, ?)", (status.text, 
                                                        status.author.screen_name, 
                                                        status.created_at, 
                                                        status.source)

Just add one more parenthesis after your tuple argument to close the method call.

Answered 2012-02-27T05:42:33.227

import sqlite3 as lite
con = lite.connect('test.db')
cur = con.cursor()   

cur.execute("CREATE TABLE TWEETS(txt text, author text, created int, source text)")

Then later:

cur.executemany("INSERT INTO TWEETS(?, ?, ?, ?)", (status.text, 
                                      status.author.screen_name, 
                                      status.created_at, 
                                      status.source))

Answered 2012-02-24T17:20:11.640

Full disclosure: I'm still new to this stuff. However, I got it working by changing your code to:

cur.execute("INSERT INTO TWEETS VALUES(?,?,?,?)", (status.text, status.author.screen_name, status.created_at, status.source))
con.commit()

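It looks to me like you are reading one status at a time. The executemany method is meant for the case where you have several statuses to insert at once. For example: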

(['sometext', 'bob','2013-02-01','Twitter for Android'], ['someothertext', 'helga', '2013-01-31', 'MacSomething'])
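
To make the difference concrete, here is a minimal sketch of executemany with the two example rows above. It assumes Python 3 and an in-memory database (":memory:") instead of the file path from the question, and it stores the dates as plain strings.

```python
import sqlite3

# In-memory database, used here only so the sketch is self-contained.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE TWEETS(txt text, author text, created int, source text)")

# executemany expects a sequence of rows, each row being one set of parameters.
statuses = [
    ("sometext", "bob", "2013-02-01", "Twitter for Android"),
    ("someothertext", "helga", "2013-01-31", "MacSomething"),
]
con.executemany("INSERT INTO TWEETS VALUES(?, ?, ?, ?)", statuses)
con.commit()

authors = [r[0] for r in con.execute("SELECT author FROM TWEETS ORDER BY author")]
print(authors)
```

With a single status, as in on_status, plain execute with one tuple is the simpler fit.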

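I'm definitely no expert, and I'm not sure what effect calling commit() on every entry has... I'd guess performance is poor, but it works for a single search term in the query.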
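
If committing on every tweet does turn out to be too slow, one possible approach is to commit only every few rows. The sketch below is an illustration under assumptions, not part of the original script: BatchedWriter and its batch_size parameter are hypothetical names, and it uses Python 3 with an in-memory database.

```python
import sqlite3


class BatchedWriter:
    """Hypothetical helper: insert rows one at a time, commit every batch_size rows."""

    def __init__(self, con, batch_size=100):
        self.con = con
        self.batch_size = batch_size
        self.pending = 0  # rows inserted since the last commit

    def add(self, row):
        self.con.execute("INSERT INTO TWEETS VALUES(?, ?, ?, ?)", row)
        self.pending += 1
        if self.pending >= self.batch_size:
            self.flush()

    def flush(self):
        self.con.commit()
        self.pending = 0


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE TWEETS(txt text, author text, created int, source text)")

writer = BatchedWriter(con, batch_size=2)
for i in range(3):
    writer.add(("tweet %d" % i, "bob", "2013-02-01", "Twitter for Android"))

pending_before_flush = writer.pending  # rows not yet committed
writer.flush()  # commit the remainder, e.g. when the stream stops
count = con.execute("SELECT COUNT(*) FROM TWEETS").fetchone()[0]
print(pending_before_flush, count)
```

The trade-off is that rows inserted since the last commit can be lost if the script dies before flush() runs, so for a low-volume stream a commit per tweet may be perfectly acceptable.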

Thanks for posting your code; I finally learned how to do streaming.

Answered 2013-02-01T09:00:15.560

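I'm new to tweepy, but these are the modifications that worked for me. You need to add VALUES after INSERT INTO TWEETS. Also, don't forget to commit the changes. This is the link I referred to: related post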

     cur.execute("INSERT INTO TWEETS VALUES(?, ?, ?, ?)", (status.text, 
                                                        status.author.screen_name, 
                                                        status.created_at, 
                                                        status.source))

     con.commit()
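
The effect of the missing VALUES keyword can be reproduced without tweepy. This is a small sketch assuming Python 3 and an in-memory database; the sample row is taken from the console output in the question, with the timestamp stored as a string.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE TWEETS(txt text, author text, created int, source text)")

row = ("RT @LunchboxHQ: test 2", "LunchboxHQ", "2012-02-29 18:03:42", "Echofon")

# Without VALUES, the parenthesised list is parsed as a column list,
# where "?" placeholders are not allowed, so SQLite rejects the statement.
try:
    cur.execute("INSERT INTO TWEETS(?, ?, ?, ?)", row)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# With VALUES, the four placeholders are bound to the four tuple items.
cur.execute("INSERT INTO TWEETS VALUES(?, ?, ?, ?)", row)
con.commit()

saved = cur.execute("SELECT txt, author, created, source FROM TWEETS").fetchall()
print(error_message)
print(saved)
```

The first statement fails with a syntax error, while the second stores the row, which is only persisted once commit() is called.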

Answered 2014-01-27T19:38:51.407