This is the structure of the documents in my Post collection.
{
"_id" : ObjectId("51ed845c92964454ee85fa55"),
"_cls" : "Post",
"user" : ObjectId("51eb52ba8f24a04aa4761cb8"),
"tags" : [
{
"tag" : ObjectId("51eb52be8f24a04aa4761cbe"),
"name" : "rugby"
},
{
"tag" : ObjectId("51eb52ba8f24a04aa4761cb8"),
"name" : "john smith"
}
],
"content" : "my fourth post!",
"upvotes" : 1,
"downvotes" : 0,
"rank" : 5343.95577,
"replies" : [],
"date_created" : ISODate("2013-07-22T15:13:32.650Z"),
"date_modified" : ISODate("2013-07-22T15:13:32.650Z")
},
{
"_id" : ObjectId("51ec9b188f24a04ff0bd8668"),
"_cls" : "Post",
"user" : ObjectId("51eb52ba8f24a04aa4761cb8"),
"tags" : [
{
"tag" : ObjectId("51eb52be8f24a04aa4761cba"),
"name" : "rugby"
}
],
"content" : "http://www.livememe.com/ebejev5 blah blah",
"upvotes" : 1,
"downvotes" : 0,
"replies" : [],
"date_created" : ISODate("2013-07-21T22:38:16.000Z"),
"date_modified" : ISODate("2013-07-21T22:38:16.000Z"),
"rank" : 5000
}
What I want to do is basically normalize each post's rank by the highest rank for its tag. The steps I could take programmatically are:
1. Get the posts.
2. Group by tag (the first tag in each tags array is fine).
3. Calculate the highest rank among the posts for each tag (max); call it trank.
4. For each post with that tag, calculate a new value, erank, by dividing: erank = rank/trank.
5. Sort the final output by erank.
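To make the intended result concrete, here is a minimal plain-JavaScript sketch of those five steps, run on in-memory objects rather than in MongoDB. The function name `normalizeRanks` and the sample data are illustrative and not part of the real collection; it assumes every post has at least one tag and that each tag's highest rank is positive.

```javascript
// Plain-JavaScript sketch of the five steps (no MongoDB involved).
// `normalizeRanks` and `samplePosts` are hypothetical names for illustration.
function normalizeRanks(posts) {
  // Steps 2-3: group by the first tag and track the highest rank per tag (trank).
  const trankByTag = new Map();
  for (const post of posts) {
    const tagId = String(post.tags[0].tag); // stringify so ids compare by value
    const current = trankByTag.get(tagId);
    if (current === undefined || post.rank > current) {
      trankByTag.set(tagId, post.rank);
    }
  }
  // Step 4: erank = rank / trank for each post.
  const ranked = posts.map((post) => ({
    ...post,
    erank: post.rank / trankByTag.get(String(post.tags[0].tag)),
  }));
  // Step 5: sort by erank, highest first.
  return ranked.sort((a, b) => b.erank - a.erank);
}

// Made-up data: plain strings stand in for ObjectId values.
const samplePosts = [
  { _id: "p1", rank: 50, tags: [{ tag: "t1", name: "rugby" }] },
  { _id: "p2", rank: 100, tags: [{ tag: "t1", name: "rugby" }] },
  { _id: "p3", rank: 30, tags: [{ tag: "t2", name: "john smith" }] },
];

const result = normalizeRanks(samplePosts);
console.log(result.map((p) => [p._id, p.erank]));
```

With this data, p2 and p3 are each the top post for their tag and get erank 1, while p1 gets 50/100 = 0.5 and sorts last.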
I am trying to do this with an aggregation in MongoDB. This is what I have so far; it only calculates the highest rank for each tag.
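db.post.aggregate([
    { "$unwind" : "$tags" },
    // one group per (tag, rank) pair, keeping the tag subdocument
    { "$group" : {
        "_id" : { "tag" : "$tags.tag", "rank" : "$rank" },
        "tagName" : { "$first" : "$tags" }
    }},
    // highest rank for each tag
    { "$group" : {
        "_id" : { "tag" : "$tagName.tag" },
        "trank" : { "$max" : "$_id.rank" }
    }}
])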
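For comparison, the per-tag maximum can probably be computed with a single $group after the $unwind. The sketch below only builds the pipeline as a JavaScript array; the variable name `maxRankPerTagPipeline` is illustrative. Note that it considers every tag of a post, not just the first one.

```javascript
// Sketch: highest rank per tag using one $unwind and one $group.
// `maxRankPerTagPipeline` is a hypothetical name for illustration.
const maxRankPerTagPipeline = [
  // one document per (post, tag) pair
  { $unwind: "$tags" },
  // group directly by the tag id and take the highest rank seen
  {
    $group: {
      _id: "$tags.tag",
      name: { $first: "$tags.name" },
      trank: { $max: "$rank" },
    },
  },
];

// In the mongo shell this would be run as:
//   db.post.aggregate(maxRankPerTagPipeline)
```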
Edit: OK, I believe I have figured it out. It still needs more testing, but it seems to work.
db.post.aggregate([
    // Keep only posts carrying one of the requested tags
    { $match : { tags : { $elemMatch : { tag : { $in : [ ObjectId("51eb52ba8f24a04aa4761cb8") ] } } } } },
    { $unwind : "$tags" },
    // Per tag: collect its posts and the highest rank (trank)
    { $group : {
        _id : "$tags.tag",
        name : { $first : "$tags.name" },
        posts : { $addToSet : {
            postid : "$_id", content : "$content", rank : "$rank", user : "$user",
            upvotes : "$upvotes", downvotes : "$downvotes",
            date_created : "$date_created", date_modified : "$date_modified"
        }},
        trank : { $max : "$rank" }
    }},
    { $unwind : "$posts" },
    // Back to one document per (post, tag), with erank = rank / trank
    { $project : {
        _id : "$posts.postid", rank : "$posts.rank", content : "$posts.content", user : "$posts.user",
        tag : { id : "$_id", name : "$name" },
        upvotes : "$posts.upvotes", downvotes : "$posts.downvotes",
        date_created : "$posts.date_created", date_modified : "$posts.date_modified",
        erank : { $divide : [ "$posts.rank", "$trank" ] }
    }},
    // {$project:{content : 1,tags : 1, _id : 1,erank: 1,posts : 1, tag : 1,rank : 1}},
    // Regroup per post: collect its tags and keep the best erank
    { $group : {
        _id : "$_id",
        tags : { $addToSet : { tag : "$tag.id", name : "$tag.name" } },
        erank : { $max : "$erank" },
        content : { $first : "$content" }, rank : { $first : "$rank" }, user : { $first : "$user" },
        upvotes : { $first : "$upvotes" }, downvotes : { $first : "$downvotes" },
        date_created : { $first : "$date_created" }, date_modified : { $first : "$date_modified" }
    }},
    { $sort : { erank : -1, _id : 1 } },
    { $skip : 0 },
    { $limit : 25 }
])
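One edge case to keep in mind: $divide raises an error when the divisor is 0, so the pipeline would fail if every post under some tag had rank 0. A minimal sketch of a guarded erank expression follows, assuming a fallback value of 0 is acceptable in that case; the name `erankExpression` is only illustrative.

```javascript
// Sketch: guard the division so a tag whose highest rank is 0 does not abort the pipeline.
// `erankExpression` is a hypothetical name; the fallback value 0 is an assumption.
const erankExpression = {
  $cond: [
    { $eq: ["$trank", 0] },                  // if the tag's highest rank is 0 ...
    0,                                       // ... use 0 instead of dividing
    { $divide: ["$posts.rank", "$trank"] },  // otherwise erank = rank / trank
  ],
};

// It would take the place of the plain $divide in the $project stage:
//   erank : erankExpression
```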