我试图捕捉在实时推文流传输期间从对等方重置连接时引发的异常,但似乎 try-exception 块没有正确捕捉引发的错误并通过它。请告知,如果该块没有正确放置在代码中或代码有问题。
我创建了一个脚本,可以将推文实时流式传输到 excel 文件中。很多时候,由于 ECONNRESET 错误(这是由对等方重置连接)导致流式传输断开连接 -
Exception in thread Thread-1:
Traceback (most recent call last):
File “/usr/lib/python2.7/threading.py”, line 801, in __bootstrap_inner
self.run()
File “/usr/lib/python2.7/threading.py”, line 754, in run
self.__target(*self.__args, **self.__kwargs)
File “/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py”, line 297, in _run
six.reraise(*exc_info)
File “/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py”, line 266, in _run
self._read_loop(resp)
File “/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py”, line 316, in _read_loop
line = buf.read_line().strip()
File “/usr/local/lib/python2.7/dist-packages/tweepy/streaming.py”, line 181, in read_line
self._buffer += self._stream.read(self._chunk_size)
File “/usr/local/lib/python2.7/dist-packages/urllib3/response.py”, line 430, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File “/usr/lib/python2.7/contextlib.py”, line 35, in exit
self.gen.throw(type, value, traceback)
File “/usr/local/lib/python2.7/dist-packages/urllib3/response.py”, line 349, in _error_catcher
raise ProtocolError(‘Connection broken: %r’ % e, e)
ProtocolError: (‘Connection broken: error("(104, ‘ECONNRESET’)",)’, error("(104, ‘ECONNRESET’)",))
它是一个协议错误,我尝试通过导入 urllib3 库来捕获此错误,因为它具有协议异常,但 try 和异常块无法抑制它并继续流式传输。
import pandas as pd
import csv
from bs4 import BeautifulSoup
import re
import tweepy
import ast
from datetime import datetime
import time
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
from unidecode import unidecode
from urllib3.exceptions import ProtocolError
from urllib3.exceptions import IncompleteRead
import requests
consumer_key= 'xxxxxxxxx'
consumer_secret= 'xxxxxxxxx'
access_token= 'xxxxxxxxx'
access_token_secret= 'xxxxxxxxx'
with open('TEST_FEB.csv','w')as f:
f.truncate()
f.close()
class listener(StreamListener):
def on_data(self,data):
data1 = json.loads(data)
time = data1["created_at"]
if hasattr(data1,"retweeted_status:"):
tweet = unidecode(data1["tweet"]["text"])
if data1["truncated"] == "true":
tweet = unidecode(data1["extended_tweet"]["full_text"])
else:
tweet = unidecode(data1["text"])
tweet1 = BeautifulSoup(tweet, "lxml").get_text()
url = "https://twitter.com/{}/status/{}".format(data1["user"]
["screen_name"], data1["id_str"])
file = open('TEST_FEB.csv', 'a')
csv_writer = csv.writer(file)
csv_writer.writerow([time, tweet1, url])
file.close()
def on_limit(self, track):
return True
auth = OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token,access_token_secret)
while True:
try:
twitterStream = Stream(auth, listener(),
wait_on_rate_limit=True, retry_count=10, stall_warnings=True)
twitterStream.filter(track=["abcd"], async = True)
except ProtocolError as error:
print (str(error))
continue
except IncompleteRead as IR:
print (str(IR))
continue
预期的结果是,每当从对等方重置连接并引发所述错误时,代码应该抑制它并继续流式传输。当前形式的代码不是这样工作的。