
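I'm learning node, express, and mongo, and JavaScript along the way. I'm trying to build a feature that uses rssparser to fetch a list of stories and save them to a mongo database with mongoose.

I have the RSS pull working and I'm iterating over the stories; it's the saving that I'm having trouble with. I want to 1) check that the story doesn't already exist in the database, and 2) if it doesn't, save it. I think I'm getting lost in how the callbacks are handled. Here's my current code, with comments.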

rssparser.parseURL(url, options, function(err,out){
    // out.items is an array of the items pulled
    var items = out.items;
    var story;
    for (var i=0; i<items.length; i++){

        //create a mongoose story
        story = new schemas.Stories({
            title: items[i].title,
            url: items[i].url,
            summary: items[i].summary,
            published: items[i].published_at
        });

        //TODO: for testing - these show up correctly.  
        //If I pull 10 stories, I get 10 entries from here that match
        //So "story" is holding the current story
        console.log("items[i] is :" + items[i].title);
        console.log("story title is : " + story.title);

        // setup query to see if it's already in db
        var query = schemas.Stories.findOne({
            "title" : story.title,
            "url" : story.url
        });


        //execute the query
        query.exec( function(err, row){
            if(err) console.log("error-query: " + err);
            console.log("row: "+ row);
            if(!row) {
                // not there, so save
                console.log('about to save story.title: ' + story.title);
                story.save(function (err){
                    console.log("error in save: " + err);
                });
            }
        });

    }
});

When it runs, I see a lot of console output.

It starts by showing all the stories (many omitted):

items[i] is :TSA Drops Plan to Let Passengers Carry Small Knives on Planes               
story title is : TSA Drops Plan to Let Passengers Carry Small Knives on Planes           
items[i] is :BUILDING COLLAPSE:1 Reportedly Dead, 13 Pulled From Philly Rubble           
story title is : BUILDING COLLAPSE:1 Reportedly Dead, 13 Pulled From Philly Rubble       
items[i] is :CONTROVERSIAL PAST: Obama's UN Nominee Once Likened US 'Sins' to Nazis'     
story title is : CONTROVERSIAL PAST: Obama's UN Nominee Once Likened US 'Sins' to Nazis' 
items[i] is :WRITING OUT WRIGHTS: Bill Gives First Powered Flight Nod to Whitehead       
story title is : WRITING OUT WRIGHTS: Bill Gives First Powered Flight Nod to Whitehead   
items[i] is :BREAKING NEWS: Rice Named to Top Security Post Despite Libya Fallout        
story title is : BREAKING NEWS: Rice Named to Top Security Post Despite Libya Fallout   

Then it continues like this (many omitted):

row: null                                                                     
about to save story.title: Best Ribs in America                               
row: null                                                                     
about to save story.title: Best Ribs in America                               
row: null                                                                     
about to save story.title: Best Ribs in America                               
row: null                                                                     
about to save story.title: Best Ribs in America                               
row: null                                                                     
about to save story.title: Best Ribs in America                               
row: null                                                                     
about to save story.title: Best Ribs in America                               
row: { title: 'Best Ribs in America',                                         
  url: 'http://www.foxnews.com/leisure/2013/06/05/10-best-ribs-in-america/',  
  published: 1370463800000,                                                   
  _id: 51af9f881995d40425000023,                                              
  __v: 0 }                                                                    

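It repeats the same "about to save" title (which is the last story in the feed), and it saves that story only once, as the last row shows.

The console.log output is exactly what I pasted above: all the story titles are printed at the top, and everything from inside the query.exec() callback comes at the bottom.

Any help is appreciated...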


2 Answers


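The problem is that by the time the exec callbacks actually run, the story variable they reference has already been set to the last item iterated over in the for loop, because every callback function refers to the same single variable.

The simplest way to fix this is to wrap everything inside the for loop in a function that you immediately invoke with the current item as its argument, like this: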
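
To see the effect in isolation, here is a minimal sketch that needs neither mongoose nor rssparser. The defer helper and the titles array are made up for illustration: defer simply queues callbacks and runs them after the loop has finished, which is roughly what happens to the query.exec callbacks.

```javascript
// Stand-in for an asynchronous API (an assumption for this demo): callbacks
// are queued and only run later, after the loop has finished.
var pending = [];
function defer(callback) {
    pending.push(callback);
}

var titles = ['first', 'second', 'third'];
var seen = [];
var story; // one variable, shared by every callback created below

for (var i = 0; i < titles.length; i++) {
    story = titles[i];
    defer(function () {
        // Reads story when the callback runs, not when it was created.
        seen.push(story);
    });
}

// The loop is done; now the queued callbacks fire.
pending.forEach(function (callback) { callback(); });

console.log(seen); // [ 'third', 'third', 'third' ]
```

Every callback sees the last title, which matches the repeated "Best Ribs in America" lines in the question's output.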

rssparser.parseURL(url, options, function(err,out){
    // out.items is an array of the items pulled
    var items = out.items;
    for (var i=0; i<items.length; i++){
        (function(item) {

            //create a mongoose story
            var story = new schemas.Stories({
                title: item.title,
                url: item.url,
                summary: item.summary,
                published: item.published_at
            });

            // setup query to see if it's already in db
            var query = schemas.Stories.findOne({
                "title" : story.title,
                "url" : story.url
            });

            //execute the query
            query.exec( function(err, row){
                if(err) console.log("error-query: " + err);
                console.log("row: "+ row);
                if(!row) {
                    // not there, so save
                    console.log('about to save story.title: ' + story.title);
                    story.save(function (err){
                        console.log("error in save: " + err);
                    });
                }
            });

        })(items[i]);
    }
});

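I haven't tested this, but I believe you'll find it solves your problem.

If your platform supports it (node.js does), another simpler, cleaner, and better way is to iterate over the items with the array's forEach method. This version is prettier: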

rssparser.parseURL(url, options, function(err,out){
    // out.items is an array of the items pulled
    out.items.forEach(function(item) {

        //create a mongoose story
        var story = new schemas.Stories({
            title: item.title,
            url: item.url,
            summary: item.summary,
            published: item.published_at
        });

        // setup query to see if it's already in db
        var query = schemas.Stories.findOne({
            "title" : story.title,
            "url" : story.url
        });

        //execute the query
        query.exec( function(err, row){
            if(err) console.log("error-query: " + err);
            console.log("row: "+ row);
            if(!row) {
                // not there, so save
                console.log('about to save story.title: ' + story.title);
                story.save(function (err){
                    console.log("error in save: " + err);
                });
            }
        });

    });
});
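
As a side note, and assuming a runtime with ES2015 support (which is newer than this answer), declaring the per-item variable with let or const inside the loop body gives each iteration its own binding, so a plain for loop behaves like the forEach version. The sketch below compares both; defer and titles are made-up stand-ins, not part of rssparser or mongoose.

```javascript
// Stand-in for an asynchronous API (an assumption for this demo): callbacks
// are queued and run only after both loops have finished.
var pending = [];
function defer(callback) {
    pending.push(callback);
}

var titles = ['first', 'second', 'third'];
var seenWithLet = [];
var seenWithForEach = [];

for (let i = 0; i < titles.length; i++) {
    const story = titles[i]; // block-scoped: a fresh binding per iteration
    defer(function () { seenWithLet.push(story); });
}

titles.forEach(function (title) {
    var story = title; // function-scoped: a new variable per callback call
    defer(function () { seenWithForEach.push(story); });
});

pending.forEach(function (callback) { callback(); });

console.log(seenWithLet);     // [ 'first', 'second', 'third' ]
console.log(seenWithForEach); // [ 'first', 'second', 'third' ]
```
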
answered 2013-06-06T09:54:25.063

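Well, node is an event-driven server, and JavaScript is event-driven too, so you can call things asynchronously.

You need to use some kind of async pattern to do what you want.

First, if you are using mongoose, you can take advantage of its Schema class to guard against items that already exist, without querying the database a second time: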

var mongoose = require('mongoose');

var schema = new mongoose.Schema({
    title: String,
    url: { type: String, unique: true },
    summary: String,
    published: Date

})

var model = mongoose.model('stories', schema)

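The url field is declared unique, so saving a story whose url is already stored results in a duplicate-key error and the document is not saved.

Now, to iterate over the items and save each one, we need some kind of pattern. Fortunately, the async library gives us one: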
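
A hedged sketch of how a save callback could tell a duplicate apart from a real failure: MongoDB reports unique-index violations with error code 11000, so the check below looks only at err.code. The helper names isDuplicateKeyError and makeSaveCallback are made up for illustration, and the sketch assumes the unique index has actually been built in the database, since mongoose's unique option defines an index rather than a validator.

```javascript
// MongoDB signals a unique-index violation with error code 11000.
function isDuplicateKeyError(err) {
    return Boolean(err) && err.code === 11000;
}

// Hypothetical helper: builds a callback suitable for passing to save().
// The log parameter is any function that accepts a message string.
function makeSaveCallback(log) {
    return function (err) {
        if (!err) {
            return log('saved');
        }
        if (isDuplicateKeyError(err)) {
            return log('skipped duplicate'); // expected for known stories
        }
        log('save failed: ' + err.message);  // anything else is a real error
    };
}
```

With a model like the one above, the call would look like m.save(makeSaveCallback(console.log)).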

var async = require('async');

rssparser.parseURL(url, options, function(err, out){
    async.each(out.items, function(item, callback){

        var m = new model({
            title: item.title,
            url: item.url,
            summary: item.summary,
            published: item.published_at
        })

        m.save(function(err, result){
            callback(null)
        });

    }, function(err){
        // all saves have finished; follow-up work can go here
    });
});

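We use async in parallel mode because we don't care whether some of the items are duplicates. You could also keep track with an array that you push err || result onto, so you can see how many items were saved.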
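
The tracking idea can be sketched roughly as follows. To keep the example runnable without a database or the async package, eachParallel is a simplified hand-written stand-in for async.each (it ignores errors passed to its callback) and fakeSave is a hypothetical stand-in for m.save that rejects repeated URLs with a code 11000 error; both are assumptions made for illustration.

```javascript
// Simplified stand-in for async.each: runs iteratee for every item and calls
// done once every iteratee has called back. Errors are ignored here.
function eachParallel(items, iteratee, done) {
    var remaining = items.length;
    if (remaining === 0) return done(null);
    items.forEach(function (item) {
        iteratee(item, function () {
            remaining -= 1;
            if (remaining === 0) done(null);
        });
    });
}

// Hypothetical stand-in for m.save(): fails with a duplicate-key style error
// (code 11000) for a URL it has already seen, otherwise "saves" the item.
var seenUrls = {};
function fakeSave(item, callback) {
    if (seenUrls[item.url]) {
        var err = new Error('duplicate key: ' + item.url);
        err.code = 11000;
        return callback(err);
    }
    seenUrls[item.url] = true;
    callback(null, item);
}

function saveAll(items, done) {
    var outcomes = [];
    eachParallel(items, function (item, callback) {
        fakeSave(item, function (err, result) {
            outcomes.push(err || result); // keep whichever one we got
            callback(null);               // never abort the whole batch
        });
    }, function () {
        var saved = outcomes.filter(function (o) {
            return !(o instanceof Error);
        });
        done(outcomes, saved.length);
    });
}
```

Calling saveAll(items, function (outcomes, savedCount) { ... }) then gives both the full list of outcomes and the number of items that were actually stored.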

answered 2013-06-09T22:03:10.527