I'm using the MongoDB aggregation framework to try to gather some totals from our dataset.
Here is a sample of the source data:
{
  "urn": "urn:content:epi:0001",
  "duration": 3450272,
  "profiles": {
    "low": {
      "mp3": {
        "size": 8623059425,
        "url": "0001_low.mp3"
      },
      "oga": {
        "size": 8623059425,
        "url": "0001_low.oga"
      },
      "m4a": {
        "size": 8623059425,
        "url": "0001_low.m4a"
      }
    },
    "medium": {
      "mp3": {
        "size": 8623059425,
        "url": "0001_medium.mp3"
      },
      "oga": {
        "size": 8623059425,
        "url": "0001_medium.oga"
      },
      "m4a": {
        "size": 8623059425,
        "url": "0001_medium.m4a"
      }
    },
    "high": {
      "mp3": {
        "size": 8623059425,
        "url": "0001_high.mp3"
      },
      "oga": {
        "size": 8623059425,
        "url": "0001_high.oga"
      },
      "m4a": {
        "size": 8623059425,
        "url": "0001_high.m4a"
      }
    }
  }
}
What I want to do is split each profiles.(low|medium|high).(mp3|oga|m4a) entry of every document
into its own document/item for aggregation, for example:
{
  "_id": null,
  "files": [
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_low.mp3"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_low.oga"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_low.m4a"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_medium.mp3"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_medium.oga"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_medium.m4a"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_high.mp3"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_high.oga"
    },
    {
      "urn": "urn:content:epi:0001",
      "duration": 3450272,
      "size": 8623059425,
      "url": "0001_high.m4a"
    }
  ]
}
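To make the intended reshaping concrete, here is a rough sketch of it in plain JavaScript, outside of MongoDB. The function name flattenProfiles and the trimmed-down sample document are made up for illustration only; this is not a MongoDB API and not an aggregation pipeline, just the transformation I would like the database to perform for me.

```javascript
// Hypothetical helper (illustration only, not part of MongoDB):
// emits one entry per profiles.<quality>.<format> file, copying the
// parent document's urn and duration onto each entry.
function flattenProfiles(doc) {
  const files = [];
  for (const quality of Object.keys(doc.profiles)) {
    for (const format of Object.keys(doc.profiles[quality])) {
      const file = doc.profiles[quality][format];
      files.push({
        urn: doc.urn,
        duration: doc.duration,
        size: file.size,
        url: file.url,
      });
    }
  }
  return files;
}

// Trimmed-down version of the sample document above (assumed shape).
const sample = {
  urn: "urn:content:epi:0001",
  duration: 3450272,
  profiles: {
    low: {
      mp3: { size: 8623059425, url: "0001_low.mp3" },
      oga: { size: 8623059425, url: "0001_low.oga" },
    },
    high: {
      m4a: { size: 8623059425, url: "0001_high.m4a" },
    },
  },
};

// Same overall shape as the desired result: a single grouped document.
const result = { _id: null, files: flattenProfiles(sample) };
console.log(JSON.stringify(result, null, 2));
```

Once the files are flattened like this, totals such as the sum of size or duration become simple to compute, which is the part I would like to do inside the database.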
Is this possible with the aggregation framework, or is it something that could be done with MapReduce instead?