I'm using the praw reddit library to pull data from reddit, and I ran into this piece of code in the BaseReddit class (full source). I don't understand why it returns any data:

def get_content(self, page_url, limit=0, url_data=None, place_holder=None,
                root_field='data', thing_field='children',
                after_field='after'):
    """A generator method to return reddit content from a URL. Starts at
    the initial page_url, and fetches content using the `after` JSON data
    until `limit` entries have been fetched, or the `place_holder` has been
    reached.

    :param page_url: the url to start fetching content from
    :param limit: the maximum number of content entries to fetch. If
        limit <= 0, fetch the default_content_limit for the site. If None,
        then fetch unlimited entries--this would be used in conjunction
        with the place_holder param.
    :param url_data: dictionary containing extra GET data to put in the url
    :param place_holder: if not None, the method will fetch `limit`
        content, stopping if it finds content with `id` equal to
        `place_holder`.
    :param data_field: indicates the field in the json response that holds
        the data. Most objects use 'data', however some (flairlist) don't
        have the 'data' object. Use None for the root object.
    :param thing_field: indicates the field under the data_field which
        contains the list of things. Most objects use 'children'.
    :param after_field: indicates the field which holds the after item
        element
    :type place_holder: a string corresponding to a reddit content id, e.g.
        't3_asdfasdf'
    :returns: a list of reddit content, of type Subreddit, Comment,
        Submission or user flair.
    """
    content_found = 0

    if url_data is None:
        url_data = {}
    if limit is None:
        fetch_all = True
    elif limit <= 0:
        fetch_all = False
        limit = int(self.config.default_content_limit)
    else:
        fetch_all = False

    # While we still need to fetch more content to reach our limit, do so.
    while fetch_all or content_found < limit:
        page_data = self.request_json(page_url, url_data=url_data)
        if root_field:
            root = page_data[root_field]
        else:
            root = page_data
        for thing in root[thing_field]:
            yield thing
            content_found += 1
            # Terminate when we reached the limit, or place holder
            if (content_found == limit or
                place_holder and thing.id == place_holder):
                return
        # Set/update the 'after' parameter for the next iteration
        if after_field in root and root[after_field]:
            url_data['after'] = root[after_field]
        else:
            return

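As far as I can see, none of the return statements has an argument, so each of them should default to returning None. Can someone explain this to me?

Note: the code is Python 2.x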

2 Answers

It's a generator. See the yield statement for a hint.

http://wiki.python.org/moin/Generators
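
To make the hint concrete, here is a minimal sketch that is not taken from praw; `count_up_to` is a made-up name used only for illustration. Calling a function that contains `yield` does not run its body. It hands back a generator object, and a bare `return` inside it just ends the iteration instead of giving None to the caller.

```python
import types


def count_up_to(n):
    """Hypothetical generator: yields 1..n, then stops with a bare return."""
    i = 1
    while True:
        if i > n:
            return  # ends the iteration; no value is handed to the caller
        yield i
        i += 1


gen = count_up_to(3)
# Calling the function did not run its body; it built a generator object.
print(isinstance(gen, types.GeneratorType))  # True
# Iterating pulls out the yielded values until the bare return is reached.
print(list(gen))  # [1, 2, 3]
```

The values the caller sees are the ones passed to `yield`; the `return` only marks where the sequence stops.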

answered 2012-07-22T22:10:10.630

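This is a generator function, as you can tell from the yield statement. Each value is effectively "returned" to the caller without the function actually returning. When another value is requested, the generator resumes from the point where it last yielded (in the code below, it carries on with the for thing loop...).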

for thing in root[thing_field]:
    yield thing

A simple example:

def blah():
    for i in xrange(5):
        yield i + 3

numbers = blah()
print next(numbers)
# lots of other code here...
# now we need the next value
print next(numbers)
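
The same idea applied to the paging pattern in the question could look roughly like the sketch below. It is an assumption-laden simplification, not praw's code: `FAKE_PAGES` and `get_things` are hypothetical names, and the dictionary stands in for the JSON that `request_json` would fetch. It shows both bare `return` statements ending the iteration, one when the limit is hit and one when there is no further page.

```python
# Hypothetical stand-in for the JSON pages a site would return; not real reddit data.
FAKE_PAGES = {
    None: {'children': ['a', 'b'], 'after': 'p2'},
    'p2': {'children': ['c', 'd'], 'after': 'p3'},
    'p3': {'children': ['e'], 'after': None},
}


def get_things(limit=None):
    """Yield items page by page, stopping at `limit` or after the last page."""
    found = 0
    after = None
    while True:
        page = FAKE_PAGES[after]
        for thing in page['children']:
            yield thing
            found += 1
            if found == limit:
                return  # limit reached: end the iteration
        if page['after']:
            after = page['after']
        else:
            return  # no further page: end the iteration


print(list(get_things(limit=3)))  # ['a', 'b', 'c']
print(list(get_things()))         # ['a', 'b', 'c', 'd', 'e']
```

Because the next page is only looked up when the caller asks for more items, a consumer that stops early never triggers the later lookups.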
answered 2012-07-22T22:13:11.273