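I am trying to catch the exception raised when the connection is reset by the peer during a live tweet stream, but the try/except block does not seem to catch the error and move past it. Please advise whether the block is misplaced in the code or whether something else is wrong with the code.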

I have written a script that streams tweets in real time into an Excel file. Quite often the stream disconnects with an ECONNRESET error (connection reset by peer):

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 297, in _run
    six.reraise(*exc_info)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 266, in _run
    self._read_loop(resp)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 316, in _read_loop
    line = buf.read_line().strip()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py", line 181, in read_line
    self._buffer += self._stream.read(self._chunk_size)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 430, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/urllib3/response.py", line 349, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
ProtocolError: ('Connection broken: error("(104, 'ECONNRESET')",)', error("(104, 'ECONNRESET')",))

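It is a ProtocolError, so I tried to catch it by importing the exception from the urllib3 library, but the try/except block fails to suppress it and keep the stream running.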

import pandas as pd
import csv
from bs4 import BeautifulSoup
import re
import tweepy
import ast
from datetime import datetime
import time
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
from unidecode import unidecode
from urllib3.exceptions import ProtocolError
from urllib3.exceptions import IncompleteRead
import requests

consumer_key = 'xxxxxxxxx'
consumer_secret = 'xxxxxxxxx'
access_token = 'xxxxxxxxx'
access_token_secret = 'xxxxxxxxx'

# Opening in 'w' mode already empties the file, and the with block closes it.
with open('TEST_FEB.csv', 'w'):
    pass

class listener(StreamListener):

    def on_data(self, data):
        data1 = json.loads(data)
        # Named created_at so the imported time module is not shadowed.
        created_at = data1["created_at"]
        # json.loads returns a dict, so keys are tested with "in" (hasattr never matches).
        # Retweets carry the original tweet under "retweeted_status".
        if "retweeted_status" in data1:
            tweet = unidecode(data1["retweeted_status"]["text"])
        # "truncated" is a JSON boolean, not the string "true".
        elif data1["truncated"]:
            tweet = unidecode(data1["extended_tweet"]["full_text"])
        else:
            tweet = unidecode(data1["text"])
        tweet1 = BeautifulSoup(tweet, "lxml").get_text()
        url = "https://twitter.com/{}/status/{}".format(
            data1["user"]["screen_name"], data1["id_str"])
        # Append one row per tweet; the with block closes the file.
        with open('TEST_FEB.csv', 'a') as f:
            csv.writer(f).writerow([created_at, tweet1, url])

    def on_limit(self, track):
        return True

auth = OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_token_secret)

while True:
    try:
        twitterStream = Stream(auth, listener(),
                               wait_on_rate_limit=True, retry_count=10,
                               stall_warnings=True)
        twitterStream.filter(track=["abcd"], async=True)

    except ProtocolError as error:
        print(str(error))
        continue

    except IncompleteRead as IR:
        print(str(IR))
        continue
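
One thing the traceback points to: it starts with "Exception in thread Thread-1", so the error is raised inside the worker thread that the async option starts, not in the thread that runs the while loop. A try/except in one thread does not see exceptions raised in another. The following is a minimal sketch of that effect with no Twitter code involved; FakeProtocolError and worker are hypothetical stand-ins for urllib3's ProtocolError and the streaming loop.

```python
import threading


class FakeProtocolError(Exception):
    """Hypothetical stand-in for urllib3.exceptions.ProtocolError."""


def worker():
    # Stands in for the streaming loop that dies when the peer resets the connection.
    raise FakeProtocolError("Connection broken: ECONNRESET")


caught_in_main = False
try:
    t = threading.Thread(target=worker)
    t.start()  # returns immediately; worker() runs in its own thread
    t.join()   # waits for the thread to finish, but does not re-raise its exception
except FakeProtocolError:
    caught_in_main = True

# The thread prints its own traceback to stderr; the except clause never runs.
print(caught_in_main)
```

If this is what happens in the script, the except clauses there are never reached, and the while loop keeps creating new streams because each filter call returns right away.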

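The expected result is that whenever the connection is reset by the peer and the error above is raised, the code suppresses it and keeps streaming. The code in its current form does not do that.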
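
The behaviour described here amounts to a reconnect loop around a blocking call. Below is a minimal sketch of such a loop, assuming the stream is started synchronously so that its exceptions surface in the calling thread. The names run_with_reconnect, fake_stream and FakeProtocolError are hypothetical and exist only for illustration; fake_stream fails twice and then ends normally so the example can run without network access.

```python
import time


class FakeProtocolError(Exception):
    """Hypothetical stand-in for urllib3.exceptions.ProtocolError."""


def run_with_reconnect(start_stream, retry_on, max_restarts=None, delay=0.0):
    """Call the blocking start_stream() and restart it when it raises one of retry_on.

    Returns the number of restarts once start_stream() returns normally.
    If max_restarts is set and exhausted, the last error is re-raised.
    """
    restarts = 0
    while True:
        try:
            start_stream()
            return restarts
        except retry_on as error:
            if max_restarts is not None and restarts >= max_restarts:
                raise
            restarts += 1
            print("stream dropped (%s), reconnecting" % error)
            time.sleep(delay)  # a real script would likely back off here


attempts = {"count": 0}


def fake_stream():
    # Simulates a stream whose connection is reset on the first two attempts.
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise FakeProtocolError("Connection broken: ECONNRESET")


restarts = run_with_reconnect(fake_stream, (FakeProtocolError,))
print(restarts)
```

In the script above, start_stream would be something like a function that calls twitterStream.filter(track=["abcd"]) without the async option, and retry_on would be (ProtocolError, IncompleteRead). This assumes that filter blocks the calling thread when async is not set, which is worth checking against the installed tweepy version.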
