I accidentally dropped my internet connection and got the error below. But why does this line trigger the error?

    self.content += tuple(subreddit_posts)

Or maybe I should ask: why doesn't the following block cause a sys.exit? It looks like it should catch every error:

    try:
        subreddit_posts = self.r.get_content(url, limit=10)
    except:
        print '*** Could not connect to Reddit.'
        sys.exit()

Does this mean I'm unintentionally hitting reddit over the network twice?

FYI, praw is a reddit API client, and get_content() fetches a subreddit's posts/submissions as a generator object.

The error message:

Traceback (most recent call last):
  File "beam.py", line 49, in <module>
    main()
  File "beam.py", line 44, in main
    scan.scanNSFW()
  File "beam.py", line 37, in scanNSFW
    map(self.getSub, self.nsfw)
  File "beam.py", line 26, in getSub
    self.content += tuple(subreddit_posts)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 504, in get_co
    page_data = self.request_json(url, params=params)
  File "/Library/Python/2.7/site-packages/praw/decorators.py", line 163, in wrap
    return_value = function(reddit_session, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 557, in reques
    retry_on_error=retry_on_error)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 399, in _reque
    _raise_response_exceptions(response)
  File "/Library/Python/2.7/site-packages/praw/internal.py", line 178, in _raise
    response.raise_for_status()
  File "/Library/Python/2.7/site-packages/requests/models.py", line 831, in rais
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable

The script (it's short):

import sys, os, pprint, praw

class Scanner(object):
    ''' A scanner object. '''
    def __init__(self):
        self.user_agent = 'debian.22990.myapp'
        self.r = praw.Reddit(user_agent=self.user_agent)
        self.nsfw = ('funny', 'nsfw')
        self.nsfw_posters = set()
        self.content = ()

    def getSub(self, subreddit):
        ''' Accepts a subreddit. Connects to subreddit and retrieves content.
        Unpacks generator object containing content into tuple. '''
        url = 'http://www.reddit.com/r/{sub}/'.format(sub=subreddit)
        print 'Scanning:', subreddit
        try:
            subreddit_posts = self.r.get_content(url, limit=10)
        except:
            print '*** Could not connect to Reddit.'
            sys.exit()
        print 'Constructing list.',
        self.content += tuple(subreddit_posts)
        print 'Done.'

    def addNSFWPoster(self, post):
        print 'Parsing author and adding to posters.'
        self.nsfw_posters.add(str(post.author))

    def scanNSFW(self):
        ''' Scans all NSFW subreddits. Makes list of posters.'''
#       Get content from all nsfw subreddits
        print 'Executing map function.'
        map(self.getSub, self.nsfw)
#       Scan content and get authors
        print 'Executing list comprehension.'
        [self.addNSFWPoster(post) for post in self.content]

def main():
    scan = Scanner()
    scan.scanNSFW()
    for i in scan.nsfw_posters:
        print i
    print len(scan.content)

main()

1 Answer


It looks like praw fetches objects lazily, so the actual request is only issued when you use subreddit_posts, not when get_content() returns. That explains why it blows up on that line instead of inside your try block.
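A minimal sketch of the behavior with plain Python and no network, assuming only standard generator semantics: a generator function's body does not run until the generator is iterated, so an exception raised inside it only surfaces at consumption time.

```python
def get_posts():
    """Stand-in for a lazy API call: the body runs only when iterated."""
    raise IOError("simulated network failure")
    yield  # unreachable, but its presence makes this a generator function

gen = get_posts()        # no exception yet: the body has not run
try:
    posts = tuple(gen)   # iteration starts here, and so does the error
except IOError:
    posts = ()
```

This is exactly the shape of your script: r.get_content(url, limit=10) succeeds inside the try, and the failure happens later at tuple(subreddit_posts).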

See: https://praw.readthedocs.org/en/v2.1.20/pages/lazy-loading.html

Answered 2015-03-09T20:03:05.117