
Consider the following sample documents stored in CouchDB:

 {
"_id":....,
"rev":....,
"type":"orders",
"Period":"2013-01",
"Region":"East",
"Category":"Stationary",
"Product":"Pen",
"Rate":1,
"Qty":10,
"Amount":10
}

{
"_id":....,
"rev":....,
"type":"orders",
"Period":"2013-02",
"Region":"South",
"Category":"Food",
"Product":"Biscuit",
"Rate":7,
"Qty":5,
"Amount":35
}

Consider the following SQL query:

SELECT Period, Region,Category, Product, Min(Rate),Max(Rate),Count(Rate), Sum(Qty),Sum(Amount)
FROM Sales
GROUP BY Period,Region,Category, Product;

Is it possible to create a map/reduce view in CouchDB that is equivalent to the SQL query above and produces output like the following?

[
    {
        "Period":"2013-01",
        "Region":"East",
        "Category":"Stationary",
        "Product":"Pen",
        "MinRate":1,
        "MaxRate":2,
        "OrdersCount":20,
        "TotQty":1000,
        "Amount":1750
    },
    {
    ... 
    }

]

2 Answers


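I will propose a very simple solution that requires one view for each variable you want to aggregate in the "select" clause. While it is certainly possible to aggregate all variables in a single view, the reduce function would be considerably more complex.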

The design document looks like this:

{
    "_id": "_design/ddoc",
    "_rev": "...",
    "language": "javascript",
    "views": {
        "rates": {
            "map": "function(doc) {\n  emit([doc.Period, doc.Region, doc.Category, doc.Product], doc.Rate);\n}",
            "reduce": "_stats"
        },
        "qty": {
            "map": "function(doc) {\n  emit([doc.Period, doc.Region, doc.Category, doc.Product], doc.Qty);\n}",
            "reduce": "_stats"
        }
    }
}

Now you can query <couchdb>/<database>/_design/ddoc/_view/rates?group_level=4 to get statistics on the "Rate" variable. The result should look like this:

{"rows":[
{"key":["2013-01","East","Stationary","Pen"],"value":{"sum":4,"count":3,"min":1,"max":2,"sumsqr":6}},
{"key":["2013-01","North","Stationary","Pen"],"value":{"sum":1,"count":1,"min":1,"max":1,"sumsqr":1}},
{"key":["2013-01","South","Stationary","Pen"],"value":{"sum":0.5,"count":1,"min":0.5,"max":0.5,"sumsqr":0.25}},
{"key":["2013-02","South","Food","Biscuit"],"value":{"sum":7,"count":1,"min":7,"max":7,"sumsqr":49}}
]}

For the "Qty" variable, the query would be <couchdb>/<database>/_design/ddoc/_view/qty?group_level=4.
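
To get rows shaped like the ones in the question, the two view responses can be merged on the client by key. The following is only a minimal sketch, assuming both responses were fetched with the same group_level; mergeStats is a hypothetical helper name, and the qty values below are invented for illustration.

```javascript
// Hypothetical helper: merge the grouped "rates" and "qty" view responses
// into one row per key. Assumes both were queried with the same group_level.
function mergeStats(ratesResponse, qtyResponse) {
    var byKey = {};
    ratesResponse.rows.forEach(function (row) {
        byKey[JSON.stringify(row.key)] = {
            Period: row.key[0],
            Region: row.key[1],
            Category: row.key[2],
            Product: row.key[3],
            MinRate: row.value.min,
            MaxRate: row.value.max,
            OrdersCount: row.value.count
        };
    });
    qtyResponse.rows.forEach(function (row) {
        var merged = byKey[JSON.stringify(row.key)];
        if (merged) {
            merged.TotQty = row.value.sum;
        }
    });
    return Object.keys(byKey).map(function (k) { return byKey[k]; });
}

// Illustrative data only: the rates row matches the sample output above,
// the qty row is made up.
var rates = {rows: [
    {key: ["2013-01", "East", "Stationary", "Pen"],
     value: {sum: 4, count: 3, min: 1, max: 2, sumsqr: 6}}
]};
var qty = {rows: [
    {key: ["2013-01", "East", "Stationary", "Pen"],
     value: {sum: 30, count: 3, min: 5, max: 15, sumsqr: 350}}
]};
var merged = mergeStats(rates, qty);
```

A total for "Amount" would need a third view of the same shape, which the design document above does not define.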

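Using the group_level parameter, you can control the level at which the aggregation is performed. For example, querying with group_level=2 will aggregate by "Period" and "Region" only.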
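
Conceptually, group_level=N truncates each emitted array key to its first N elements and reduces the values that share the truncated key. The sketch below simulates that idea in plain JavaScript; it is not how CouchDB implements it, groupSum is a hypothetical name, only sums are computed, and the rows are invented.

```javascript
// Simulation only: group emitted rows by the first `groupLevel` key elements
// and sum their values, mimicking what a sum-style reduce would return.
function groupSum(rows, groupLevel) {
    var totals = {};
    rows.forEach(function (row) {
        var k = JSON.stringify(row.key.slice(0, groupLevel));
        totals[k] = (totals[k] || 0) + row.value;
    });
    return Object.keys(totals).map(function (k) {
        return {key: JSON.parse(k), value: totals[k]};
    });
}

// Made-up rows, shaped like the output of the "qty" map function.
var emitted = [
    {key: ["2013-01", "East", "Stationary", "Pen"], value: 10},
    {key: ["2013-01", "East", "Food", "Biscuit"], value: 5},
    {key: ["2013-02", "South", "Food", "Biscuit"], value: 5}
];

// With group level 2, the two "2013-01"/"East" rows collapse into one.
var byRegion = groupSum(emitted, 2);
```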

answered 2013-04-29T12:30:53.760

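Up front: I believe @benedolph's answer is the best practice and the best-case scenario. Ideally, each reduce should return a single scalar value, to keep the code as simple as possible.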

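However, you do have to issue multiple queries to retrieve the full result set your question describes. If you don't have the option of running queries in parallel, or if keeping the number of queries down really matters, you can do it all in one view.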

Your map function will remain very simple:

function (doc) {
    emit([ doc.Period, doc.Region, doc.Category, doc.Product ], doc);
}

The reduce function is where it gets lengthy:

function (key, values, rereduce) {
    // helper function to sum all the values of a specified field in an array of objects
    function sumField(arr, field) {
        return arr.reduce(function (prev, cur) {
            return prev + cur[field];
        }, 0);
    }

    // helper function to create an array of just a single property from an array of objects
    // (this function came from underscore.js, at least its name and concept)
    function pluck(arr, field) {
        return arr.map(function (item) {
            return item[field];
        });
    }

    // rereduce made this more challenging, and I could not thoroughly test this right now
    // see the CouchDB wiki for more information
    if (rereduce) {
        // a rereduce handles intermediate values
        // (so the "values" below are the results of previous reduce functions, not the map function)
        return {
            OrdersCount: sumField(values, "OrdersCount"),
            MinRate: Math.min.apply(Math, pluck(values, "MinRate")),
            MaxRate: Math.max.apply(Math, pluck(values, "MaxRate")),
            TotQty: sumField(values, "TotQty"),
            Amount: sumField(values, "Amount")
        };
    } else {
        var rates = pluck(values, "Rate");

        // This takes a group of documents and gives you the stats you were asking for
        return {
            OrdersCount: values.length,
            MinRate: Math.min.apply(Math, rates),
            MaxRate: Math.max.apply(Math, rates),
            TotQty: sumField(values, "Qty"),
            Amount: sumField(values, "Amount")
        };
    }
}

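I was not able to test the "rereduce" branch of this code at all, so you will have to do that yourself (it should work, though). For information about reduce vs. rereduce, see the CouchDB wiki.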
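
One way to exercise the rereduce branch outside CouchDB is to reduce two chunks of documents separately, feed the partial results back in with rereduce set to true, and compare against a single-pass reduce. This is a sketch meant to be run under Node.js, not a substitute for testing in CouchDB itself; the reduce body is a condensed copy of the function above, and the sample documents are invented.

```javascript
// Condensed copy of the reduce function above (the key argument is unused).
function reduce(key, values, rereduce) {
    function sumField(arr, field) {
        return arr.reduce(function (prev, cur) { return prev + cur[field]; }, 0);
    }
    function pluck(arr, field) {
        return arr.map(function (item) { return item[field]; });
    }
    if (rereduce) {
        // values are results of earlier reduce calls
        return {
            OrdersCount: sumField(values, "OrdersCount"),
            MinRate: Math.min.apply(Math, pluck(values, "MinRate")),
            MaxRate: Math.max.apply(Math, pluck(values, "MaxRate")),
            TotQty: sumField(values, "TotQty"),
            Amount: sumField(values, "Amount")
        };
    }
    // values are the documents emitted by the map function
    var rates = pluck(values, "Rate");
    return {
        OrdersCount: values.length,
        MinRate: Math.min.apply(Math, rates),
        MaxRate: Math.max.apply(Math, rates),
        TotQty: sumField(values, "Qty"),
        Amount: sumField(values, "Amount")
    };
}

// Invented documents that would all share one key.
var docs = [
    {Rate: 1, Qty: 10, Amount: 10},
    {Rate: 2, Qty: 5, Amount: 10},
    {Rate: 1.5, Qty: 4, Amount: 6}
];

// Reduce everything in one pass.
var singlePass = reduce(null, docs, false);

// Reduce in two chunks, then combine the partial results via rereduce.
var partials = [
    reduce(null, docs.slice(0, 2), false),
    reduce(null, docs.slice(2), false)
];
var combined = reduce(null, partials, true);

// Both paths should agree if the rereduce branch is correct.
var matches = JSON.stringify(singlePass) === JSON.stringify(combined);
console.log(matches ? "rereduce matches" : "rereduce differs", combined);
```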

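The helper functions I added at the top actually make the code shorter and easier to read overall; they are heavily influenced by my experience with Underscore.js. However, you cannot include CommonJS modules in a reduce function, so they have to be written by hand.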

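Again, the best case is to give each aggregated field its own map/reduce index, but if that is not an option for you, the code above should get you what you described in the question.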

answered 2013-04-29T14:45:34.347