
I am trying to use Requests to build a robust way of consuming from Twitter's user stream. So far, I have produced the following basic working example:

"""
Example of connecting to the Twitter user stream using Requests.
"""

import json
import sys

import requests
from oauth_hook import OAuthHook

def userstream(access_token, access_token_secret, consumer_key, consumer_secret):
    oauth_hook = OAuthHook(access_token=access_token, access_token_secret=access_token_secret, 
                           consumer_key=consumer_key, consumer_secret=consumer_secret, 
                           header_auth=True)

    hooks = dict(pre_request=oauth_hook)
    config = dict(verbose=sys.stderr)
    client = requests.session(hooks=hooks, config=config)

    data = dict(delimited="length")
    r = client.post("https://userstream.twitter.com/2/user.json", data=data, prefetch=False)

    # TODO detect disconnection somehow
    # https://github.com/kennethreitz/requests/pull/200/files#L13R169
    # Use a timeout? http://pguides.net/python-tutorial/python-timeout-a-function/
    for chunk in r.iter_lines(chunk_size=1):
        if chunk and not chunk.isdigit():
            yield json.loads(chunk)

if __name__ == "__main__":
    import pprint
    import settings
    for obj in userstream(access_token=settings.ACCESS_TOKEN, access_token_secret=settings.ACCESS_TOKEN_SECRET, consumer_key=settings.CONSUMER_KEY, consumer_secret=settings.CONSUMER_SECRET):
        pprint.pprint(obj)

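However, I need to be able to handle disconnections gracefully. Currently, when the stream disconnects, the code above simply hangs and no exception is raised.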

What is the best way to achieve this? Is there a way to detect it through the urllib3 connection pool? Should I use a timeout?


1 Answer


I recommend adding a timeout parameter to the client.post() call: http://docs.python-requests.org/en/latest/user/quickstart/#timeouts

However, it is important to note that Requests does not set the TCP timeout, so you can set it yourself with the following:

import socket
socket.setdefaulttimeout(TIMEOUT)
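
To illustrate what the default socket timeout does, here is a minimal standard-library sketch, independent of Requests. The helper name `read_with_default_timeout` is hypothetical, and the host, port, and timeout value are whatever the caller supplies. It shows a read from a silent peer raising `socket.timeout` instead of blocking forever.

```python
import socket


def read_with_default_timeout(host, port, timeout):
    """Return data read from host:port, or None if the peer stays silent too long."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(timeout)  # applies to sockets created from now on
    try:
        with socket.create_connection((host, port)) as conn:
            try:
                return conn.recv(1024)
            except socket.timeout:
                return None  # no data within `timeout` seconds: treat as a stall
    finally:
        socket.setdefaulttimeout(previous)  # restore the process-wide setting
```

Note that `socket.setdefaulttimeout` is process-wide, so it affects every socket created afterwards, which is why the sketch restores the previous value.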
answered 2014-06-23T17:43:24.733
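
Once a timeout is in place, a stalled stream surfaces as an exception rather than a hang, so the consumer can reconnect. The following is a rough sketch of that idea and is not part of Requests: `consume_with_reconnect` and its `connect` argument are hypothetical names, and `connect` is assumed to be a callable that opens the stream and returns an iterator of lines. The retry count and backoff values are arbitrary. It relies on timeouts and connection errors being `OSError` subclasses, which is the case for `socket.timeout` and for Requests' `RequestException`.

```python
import time


def consume_with_reconnect(connect, max_retries=3, backoff=1.0, sleep=time.sleep):
    """Yield lines from connect(), reopening the stream after a timeout or drop.

    connect is a hypothetical callable that returns an iterator of lines.
    """
    failures = 0
    while True:
        try:
            for line in connect():
                failures = 0  # data arrived, so the connection is healthy
                yield line
            return  # the iterator finished without an error
        except OSError:
            failures += 1
            if failures > max_retries:
                raise  # give up after too many consecutive failures
            sleep(backoff * 2 ** (failures - 1))  # exponential backoff
```

In the question's code, `connect` could be a small function that performs the `client.post(...)` call with a timeout and returns `r.iter_lines(...)`.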