
This is the command and the error it generates:

db.tweets.aggregate(
    {$project:{'entities.hashtags.text':1}},
    {$unwind:'$entities.hashtags'},
    {$group:{_id:'$entities.hashtags.text'}})

{
    "errmsg" : "exception: aggregation result exceeds maximum document size (16MB)",
    "code" : 16389,
    "ok" : 0
}

I would like to run the following query:

Group by entities.hashtags.text and, for every hashtag that exists, count the number of documents that contain it.

This is part of a document:

...
entities: {
    media: [
        ...
    ],
    urls: [],
    hashtags: [
        {
            text: "makeuploos",
            indices: [
                54,
                65
            ]
        },
        {
            text: "onbewerkt",
            indices: [
                66,
                76
            ]
        },
        {
            text: "hoer",
            indices: [
                77,
                82
            ]
        }
    ],
...

How can I do this?


2 Answers


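Add a filter somewhere after the unwind and try to match only the relevant data. In an aggregation pipeline that filter is a $match stage, since the $where operator cannot be used there. You simply have too many distinct hashtags, and they do not fit within the 16MB limit.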
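A minimal sketch of that idea, assuming "relevant data" means a known list of hashtags. The list `wantedTags` and the helper `buildFilteredPipeline` are hypothetical names introduced only for illustration; the stages themselves are the ones from the question, with a `$match` placed right after `$unwind`.

```javascript
// Hypothetical list of hashtags of interest; not part of the original question.
var wantedTags = ['makeuploos', 'onbewerkt'];

// Build the question's pipeline with a $match stage right after $unwind,
// so only the wanted hashtags reach $group.
function buildFilteredPipeline(tags) {
  return [
    {$project: {'entities.hashtags.text': 1}},
    {$unwind: '$entities.hashtags'},
    {$match: {'entities.hashtags.text': {$in: tags}}},
    {$group: {_id: '$entities.hashtags.text'}}
  ];
}

var filteredPipeline = buildFilteredPipeline(wantedTags);

// `db` only exists inside the mongo shell; the guard lets this load elsewhere.
// The shape of the returned value depends on the server version.
if (typeof db !== 'undefined') {
  var filteredResult = db.tweets.aggregate(filteredPipeline);
}
```

Any other condition on the unwound documents could replace the `$in` filter; the point is only that fewer distinct hashtags reach `$group`, which keeps the result smaller.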

answered 2012-12-29T12:21:12.497

Starting with MongoDB v2.6, you can use the allowDiskUse option. For example:

  db.tweets.aggregate(
    [
      {$project:{'entities.hashtags.text':1}},
      {$unwind:'$entities.hashtags'},
      {$group:{_id:'$entities.hashtags.text'}}
    ],
    {
      allowDiskUse: true
    }
  )

This allows data to be written to temporary files. You can find more information here: http://docs.mongodb.org/manual/core/aggregation-pipeline-limits/#agg-memory-restrictions
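The pipeline above only lists the distinct hashtags, while the question also asks for a count per hashtag. One possible sketch adds a `$sum` accumulator to the `$group` stage. The output field name `count` and the variable names are assumptions chosen for illustration.

```javascript
// Same stages as above, plus a per-hashtag counter in $group.
var countPipeline = [
  {$project: {'entities.hashtags.text': 1}},
  {$unwind: '$entities.hashtags'},
  // $sum: 1 adds one for every unwound hashtag with this text.
  {$group: {_id: '$entities.hashtags.text', count: {$sum: 1}}}
];

// Let large pipeline stages spill to temporary files (MongoDB 2.6 or later).
var countOptions = {allowDiskUse: true};

// `db` only exists inside the mongo shell; the guard lets this load elsewhere.
if (typeof db !== 'undefined') {
  db.tweets.aggregate(countPipeline, countOptions).forEach(printjson);
}
```

Note that this counts hashtag occurrences: a tweet that repeats the same hashtag contributes once per repetition, so the result equals the number of documents only when no tweet repeats a hashtag.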

answered 2014-11-27T06:07:27.277