Note: I have included only a few documents in the output to keep the post short and readable.

Source collection:

{
        "_id" : {
                "SpId" : 840,
                "Scheduler_Id" : 1,
                "Channel_Id" : 2,
                "TweetId" : 15
        },
        "PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
        "_id" : {
                "SpId" : 840,
                "Scheduler_Id" : 1,
                "Channel_Id" : 2,
                "TweetId" : 16
        },
        "PostDate" : ISODate("2013-10-31T18:30:00Z")
}
{
        "_id" : {
                "SpId" : 840,
                "Scheduler_Id" : 1,
                "Channel_Id" : 2,
                "TweetId" : 17
        },
        "PostDate" : ISODate("2013-10-30T18:30:00Z")
}

Step 1: Group by PostDate

Query:

db.Twitter_Processed.aggregate({$match : { "_id.SpId" : 840, "_id.Scheduler_Id" : 1 }},{$project:{SpId : "$_id.SpId",Scheduler_Id : "$_id.Scheduler_Id",day:{$dayOfMonth:'$PostDate'},month:{$month:'$PostDate'},year:{$year:'$PostDate'}, senti : "$Sentiment"}}, {$group : {_id : {SpId : "$SpId", Scheduler_Id : "$Scheduler_Id",day:'$day',month:'$month',year:'$year'}, sentiment : { $sum : "$senti"}}}, {$group : {_id : "$_id" , avgSentiment : {$avg : "$sentiment"}}})

Output:

{
        "result" : [
                {
                        "_id" : {
                                "SpId" : 840,
                                "Scheduler_Id" : 1,
                                "day" : 31,
                                "month" : 10,
                                "year" : 2013
                        },
                        "avgSentiment" : 2.2700000000000005
                },
                {
                        "_id" : {
                                "SpId" : 840,
                                "Scheduler_Id" : 1,
                                "day" : 30,
                                "month" : 10,
                                "year" : 2013
                        },
                        "avgSentiment" : 4.96
                }
}

Step 2: The output I am trying to achieve:

{
        "result" : [
                {
                        "_id" : {
                                "SpId" : 840,
                                "Scheduler_Id" : 1,
                                "Date" : ISODate("2013-10-31T18:30:00Z")
                        },
                        "avgSentiment" : 2.2700000000000005
                },
                {
                        "_id" : {
                                "SpId" : 840,
                                "Scheduler_Id" : 1,
                                "Date" : ISODate("2013-10-31T18:30:00Z")
                        },
                        "avgSentiment" : 4.96
                }
}

Query attempted:

db.Twitter_Processed.aggregate({$match : { "_id.SpId" : 840, "_id.Scheduler_Id" : 1 }},{$project:{SpId : "$_id.SpId",Scheduler_Id : "$_id.Scheduler_Id",day:{$dayOfMonth:'$PostDate'},month:{$month:'$PostDate'},year:{$year:'$PostDate'}, senti : "$Sentiment"}}, {$group : {_id : {SpId : "$SpId", Scheduler_Id : "$Scheduler_Id",day:'$day',month:'$month',year:'$year'}, sentiment : { $sum : "$senti"}}}, {$group : {_id : "$_id" , avgSentiment : {$avg : "$sentiment"}}}, {$project : {_id : {SpId : "$_id.SpId",Scheduler_Id : "$_id.Scheduler_Id", date : new Date("$_id.year","$_id.month","$_id.day")}, avgSentiment : "$avgSentiment"}})

Output (error):

Error: Printing Stack Trace
    at printStackTrace (src/mongo/shell/utils.js:37:15)
    at DBCollection.aggregate (src/mongo/shell/collection.js:897:9)
    at (shell):1:22
Tue Dec 31 09:41:42.916 JavaScript execution failed: aggregate failed: {
        "errmsg" : "exception: disallowed field type Date in object expression (at 'date')",
        "code" : 15992,
        "ok" : 0
} at src/mongo/shell/collection.js:L898

How can I achieve step 2?

1 Answer

As you have noticed, the Aggregation Framework (as of MongoDB 2.4) has operators to extract parts of a date, but no easy way to create date fields.
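
One reason the attempt in the question fails is that the mongo shell evaluates `new Date(...)` as ordinary JavaScript while it builds the pipeline, so the server never sees the field paths, only a Date value, which `$project` rejects. A minimal sketch in plain JavaScript (runnable in Node.js or the shell) shows what the constructor does with those strings, and how a date could instead be rebuilt client-side from the step 1 output. The `toDate` helper and the sample `_id` are hypothetical, for illustration only:

```javascript
// The strings are not resolved as field paths; JavaScript coerces them
// to numbers (NaN), so the result is an invalid Date.
var attempted = new Date("$_id.year", "$_id.month", "$_id.day");
var attemptedIsInvalid = isNaN(attempted.getTime());

// Hypothetical helper: rebuild a UTC date from a step 1 result _id.
// $month is 1-based, while JavaScript months are 0-based.
function toDate(id) {
    return new Date(Date.UTC(id.year, id.month - 1, id.day));
}

var sampleId = { SpId: 840, Scheduler_Id: 1, day: 31, month: 10, year: 2013 };
var rebuilt = toDate(sampleId);
```

Converting after the aggregation returns keeps the pipeline unchanged, at the cost of one extra pass over the results in the client.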

There is a great blog post, Stupid date tricks with Aggregation Framework, that offers a creative workaround: use $project to truncate the date to day granularity before you $group:

db.Twitter_Processed.aggregate(

    // Match (can take advantage of suitable index)
    { $match : {
        "_id.SpId" : 840,
        "_id.Scheduler_Id" : 1
    }},

    // Extract h/m/s/ms values from PostDate for rounding
    { $project: {
        SpId : "$_id.SpId",
        Scheduler_Id : "$_id.Scheduler_Id",
        PostDate : "$PostDate",
        h  : { "$hour"   : "$PostDate" },
        m  : { "$minute" : "$PostDate" },
        s  : { "$second" : "$PostDate" },
        ms : { "$millisecond" : "$PostDate" },
        senti : "$Sentiment"
    }},

    // Subtract the h/m/s/ms values to round the date off to yyyy-mm-dd
    { $project: {
        SpId : "$_id.SpId",
        Scheduler_Id : "$_id.Scheduler_Id",

        // PostDate will end up truncated to yyyy-mm-dd granularity
        PostDate: {
            "$subtract" : [
                "$PostDate",
                {
                    "$add" : [
                        "$ms",
                        { "$multiply" : [ "$s", 1000 ] },
                        { "$multiply" : [ "$m", 60, 1000 ] },
                        { "$multiply" : [ "$h", 60, 60, 1000 ]}
                    ]
                }
            ]
        },
        // Sentiment was already renamed to senti in the previous stage
        senti: "$senti"
    }},

    { $group : {
        _id : {
            SpId : "$SpId",
            Scheduler_Id : "$Scheduler_Id",
            PostDate: "$PostDate"
        },
        sentiment : { $sum : "$senti"}
    }},

    { $group : {
        _id : "$_id" ,
        avgSentiment : {$avg : "$sentiment"}
    }}
)
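
The arithmetic in the second $project can be checked outside MongoDB. The following is a minimal sketch in plain JavaScript, assuming UTC date parts as the aggregation date operators return them; the `truncateToDay` name is hypothetical:

```javascript
// Hypothetical helper mirroring the $subtract / $add / $multiply stage:
// subtract the time of day, in milliseconds, from the original date.
function truncateToDay(date) {
    var h  = date.getUTCHours();         // corresponds to $hour
    var m  = date.getUTCMinutes();       // corresponds to $minute
    var s  = date.getUTCSeconds();       // corresponds to $second
    var ms = date.getUTCMilliseconds();  // corresponds to $millisecond
    var offset = ms + s * 1000 + m * 60 * 1000 + h * 60 * 60 * 1000;
    return new Date(date.getTime() - offset);
}

var postDate = new Date("2013-10-31T18:30:00Z");
var truncated = truncateToDay(postDate);
console.log(truncated.toISOString());
```

For the sample PostDate of 2013-10-31T18:30:00Z this yields midnight UTC on the same day, so all documents from one calendar day share a single date value in the $group key.
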
answered 2013-12-31T05:40:14.797