I've been using node.js with CoffeeScript to pull some stories from reddit.com's JSON interface, but I've hit a snag.

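I want to parse the JSON from http://www.reddit.com/r/programming/.json, then append a query string with the count and after parameters and parse again, repeating according to the parameters passed to get_stories().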

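When I run the code below with file.js > test.txt, I get unexpected results (see below). It looks like querystring.count is being updated, but every logged URL matches the one from the last pass. I'm not sure why I don't see count=0,25,50,75,125. Also, querystring.after never appears in the URL. What is going on?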

Code:

# Requires
request = require 'request'
qs = require 'querystring'
mongojs = require 'mongojs'

# Connect to db
db = mongojs 'mongodb://localhost/feedtraining', ['subreddit_stories']

get_stories = (subreddit, {per_page, pages}, storyCallback) ->
    current_page = 0
    querystring = {}

    while true
        querystring.count = current_page * per_page

        request_uri = "http://www.reddit.com/r/#{subreddit}/.json?#{qs.stringify querystring}"

        request
            uri: request_uri,
            json: true,
            (error, response, body) ->
                if !error and response.statusCode == 200
                    for item in body.data.children
                        if item.data.selftext_html is null
                            storyCallback request_uri, current_page, item.data

                    querystring.after = body.data.children[body.data.children.length-1].id
                else
                    console.log error

                return

        if current_page == pages then break else current_page++

    return

get_stories 'programming', {per_page: 25, pages: 5}, (request_uri, page, story) ->
    db.subreddit_stories.insert(story)
    console.log request_uri

Output:

http://www.reddit.com/r/programming/.json?count=125
http://www.reddit.com/r/programming/.json?count=125
http://www.reddit.com/r/programming/.json?count=125
[the same line repeats for the rest of the output, 150 lines in total]

1 Answer

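[Edit]

If you need to chain asynchronous operations, such as setting querystring.after before the next request, you can't use a while loop. The loop runs to completion, starting all of the requests before any of them has finished and before querystring.after can be set.

You can rewrite the iteration with functions as continuations, so that each request waits until the after value from the previous request is available.

Side note: since after should already move the start of the collection, you probably want to keep count at the same value. Otherwise, the size of the collection will grow with each request.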

get_stories = (subreddit, {per_page, pages}, storyCallback) ->
    current_page = 0

    send_next_request = (querystring = {}) ->
        querystring.count = per_page

        request_uri = "http://www.reddit.com/r/#{subreddit}/.json?#{qs.stringify querystring}"

        request
            uri: request_uri,
            json: true,
            (error, response, body) ->
                if !error and response.statusCode == 200
                    for item in body.data.children
                        if item.data.selftext_html is null
                            storyCallback request_uri, current_page, item.data

                    current_page++
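                    # Assumption: reddit listings expose the next-page cursor as
                    # body.data.after; each child is {kind, data}, so child.id is undefined.
                    if current_page < pages and body.data.after
                        send_next_request(after: body.data.after)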

    send_next_request()
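
To see the continuation pattern in isolation, here is a minimal JavaScript sketch that runs without network access. Everything in it is an assumption made for illustration: fakeFetch, its limit parameter, the ITEMS array, and the pending queue are hypothetical stand-ins for reddit, request, and the event loop, and are not part of the real APIs.

```javascript
// Hypothetical data source standing in for the reddit listing.
const ITEMS = ["t3_a", "t3_b", "t3_c", "t3_d", "t3_e", "t3_f", "t3_g"];

// Deferred callbacks; draining this array stands in for the event loop.
const pending = [];

// Hypothetical fetch: returns up to `limit` items after the `after` cursor,
// plus the cursor for the next page (null when nothing is left).
function fakeFetch(query, callback) {
  const start = query.after ? ITEMS.indexOf(query.after) + 1 : 0;
  const children = ITEMS.slice(start, start + query.limit);
  const hasMore = start + query.limit < ITEMS.length;
  const after = hasMore ? children[children.length - 1] : null;
  pending.push(() => callback(null, { children, after }));
}

// Each request is only sent from inside the previous request's callback,
// so the cursor is always known before the next request is built.
function getStories(fetchPage, { perPage, pages }, onStory, onDone) {
  let currentPage = 0;

  function sendNextRequest(query) {
    fetchPage(query, (error, body) => {
      if (error) return onDone(error);
      body.children.forEach((child) => onStory(currentPage, child));
      currentPage++;
      if (currentPage < pages && body.after) {
        sendNextRequest({ limit: perPage, after: body.after });
      } else {
        onDone(null);
      }
    });
  }

  sendNextRequest({ limit: perPage });
}

const seen = [];
let finished = false;
getStories(
  fakeFetch,
  { perPage: 3, pages: 2 },
  (page, story) => seen.push([page, story]),
  () => { finished = true; }
);

// Run the deferred callbacks one at a time, as the event loop would.
while (pending.length) pending.shift()();

console.log(seen);
```

The second page starts right after the last item of the first page, because the second request is not created until the first callback has delivered its cursor.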

[Original]

You need a closure around request_uri:

request_uri = "http://www.reddit.com/r/#{subreddit}/.json?#{qs.stringify querystring}"

do (request_uri) ->
  request
    url: request_uri,
    # ...

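JavaScript (with var) and CoffeeScript don't (yet) have block scope, so only one request_uri is created for the whole loop, and it can hold only one value at a time.

Add to that the fact that request is asynchronous, so the while true loop will have finished before:

storyCallback request_uri, current_page, item.data

is evaluated for any of the requests. By then, request_uri will always hold the last value assigned in the loop.

The closure creates an additional function scope, so each iteration of while true can have its own request_uri.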
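
The difference can be reproduced in a few lines of plain JavaScript. This is only a sketch under stated assumptions: makeQueue, sharedVariable, and closurePerIteration are hypothetical names, and the queue simulates callbacks that fire after the loop has ended instead of making real requests.

```javascript
// Hypothetical deferred-callback queue: callbacks run only when drain() is
// called, which mimics async callbacks firing after the loop has finished.
function makeQueue() {
  const pending = [];
  return {
    defer: (fn) => pending.push(fn),
    drain: () => { while (pending.length) pending.shift()(); },
  };
}

// One function-scoped variable shared by every callback.
function sharedVariable(pages, perPage) {
  const queue = makeQueue();
  const seen = [];
  var requestUri;
  for (var page = 0; page < pages; page++) {
    requestUri = "?count=" + page * perPage;
    queue.defer(function () { seen.push(requestUri); });
  }
  queue.drain(); // callbacks run now, after the loop is done
  return seen;
}

// An immediately invoked function gives each iteration its own requestUri,
// which is what CoffeeScript's `do (request_uri) ->` compiles to.
function closurePerIteration(pages, perPage) {
  const queue = makeQueue();
  const seen = [];
  for (var page = 0; page < pages; page++) {
    (function (requestUri) {
      queue.defer(function () { seen.push(requestUri); });
    })("?count=" + page * perPage);
  }
  queue.drain();
  return seen;
}

console.log(sharedVariable(3, 25));      // every entry is "?count=50"
console.log(closurePerIteration(3, 25)); // "?count=0", "?count=25", "?count=50"
```

The first function shows the same symptom as the question's output: every callback reports the URL from the final pass.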


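This is documented under Loops and Comprehensions:

When using a JavaScript loop to generate functions, it's common to insert a closure wrapper in order to ensure that loop variables are closed over, and all the generated functions don't just share the final values. CoffeeScript provides the do keyword, which immediately invokes a passed function, forwarding any arguments.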

for filename in list
  do (filename) ->
    fs.readFile filename, (err, contents) ->
      compile filename, contents.toString()
Answered 2013-08-20T21:22:22.783