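I'm trying to find a MongoDB script that looks at a collection containing multiple records of the same document and returns only the latest version of each document as the result set.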

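I can't explain it in English any better than that, but perhaps the small SQL query below makes it clearer. I want every document, keyed by transaction_reference, but only the version with the latest date (object_creation_date).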

select 
    t.transaction_reference, 
    t.transaction_date, 
    t.object_creation_date,
    t.transaction_sale_value
from MyTable t
inner join (
    select 
        transaction_reference, 
        max(object_creation_date) as MaxDate
    from MyTable
    group by transaction_reference
) tm 
    on t.transaction_reference = tm.transaction_reference 
    and t.object_creation_date = tm.MaxDate

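There are multiple versions of the same document because I want to store every iteration of a transaction. The first time I receive a document its transaction_status may be UNPAID; later I receive the same transaction again, this time with transaction_status PAID.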

Some analyses will sum over all unique transactions, while others may measure the time between a document with status UNPAID and the next PAID document.

As requested, here are two documents:

{
"_id": {
    "$oid": "579aa337f36d2808839a05e8"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-05T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "UNPAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
    {
        "item_identifier": "PROD01",
        "item_name": "Product One",
        "item_quantity": 1,
        "item_gross_unit_sale_value": 1800,
        "item_revenue_category": "Sales Revenue",
        "item_net_unit_cost_value": null,
        "item_net_unit_sale_value": 1500,
        "item_unit_tax_value": 300,
        "item_net_total_sale_value": 1500,
        "item_gross_total_sale_value": 1800,
        "item_tax_value": 300
    }
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
    "customer": {
        "customer_identifier": "11",
        "customer_name": "KP"
    }
}
}

And the second version, now paid:

{
"_id": {
    "$oid": "579aa387f36d2808839a05ee"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-16T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "PAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
    {
        "item_identifier": "PROD01",
        "item_name": "Product One",
        "item_quantity": 1,
        "item_gross_unit_sale_value": 1800,
        "item_revenue_category": "Sales Revenue",
        "item_net_unit_cost_value": null,
        "item_net_unit_sale_value": 1500,
        "item_unit_tax_value": 300,
        "item_net_total_sale_value": 1500,
        "item_gross_total_sale_value": 1800,
        "item_tax_value": 300
    }
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
    "customer": {
        "customer_identifier": "11",
        "customer_name": "KP"
    }
}
}

Thanks for your support, Matt

1 Answer

If I understand the question correctly, you can use something like this:

db.getCollection('yourTransactionsCollection').aggregate([
    {
        $sort: {
            "transaction_reference": 1,
            "object_creation_date": -1
        }
    },
    {
        $group: {
            _id: "$transaction_reference",
            "transaction_date": { $first: "$transaction_date" },
            "object_creation_date": { $first: "$object_creation_date" },
            "transaction_sale_value": { $first: "$transaction_sale_value" }
        }
    }
])

This produces the following result:

{
    "_id" : "SI-1",
    "transaction_date" : "2016-06-16T00:00:00.201Z",
    "object_creation_date" : "2016-07-16T00:00:00.201Z",
    "transaction_sale_value" : null
}

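Note that you can change the $sort to include only object_creation_date, but I included both transaction_reference and object_creation_date because I think it makes sense to create a compound index on the two of them rather than on the date alone. Adjust this to match your indexes so that the $sort can use one.
Also, the documents have no transaction_sale_value field, which is why it is null in the result. Perhaps you missed it, or it just isn't in your sample documents, but I think you get the idea and can adjust it to your needs.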
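If you want the whole latest document rather than a few hand-picked fields, one possible variation is to keep `$$ROOT` in the `$group` stage. The following is only a sketch under some assumptions: the field name `latest` and the helper `latestPerReference` are names made up for illustration, and the collection name is the same placeholder as above. The helper mirrors the idea in plain JavaScript so the logic can be checked without a server. It also relies on the dates being stored as ISO 8601 strings in one consistent format, as in the sample documents, so that string comparison matches chronological order.

```javascript
// Sketch only: `latest` and `latestPerReference` are illustrative names,
// not part of the original question or answer.
const pipeline = [
    // Within each transaction_reference, put the newest version first.
    { $sort: { transaction_reference: 1, object_creation_date: -1 } },
    // $$ROOT refers to the whole input document, so $first keeps the
    // complete newest version instead of individual fields.
    { $group: { _id: "$transaction_reference", latest: { $first: "$$ROOT" } } }
];

// In the mongo shell (placeholder collection name):
//   db.getCollection('yourTransactionsCollection').createIndex(
//       { transaction_reference: 1, object_creation_date: -1 })
//   db.getCollection('yourTransactionsCollection').aggregate(pipeline)

// Plain-JavaScript version of the same idea for an in-memory array.
function latestPerReference(docs) {
    const newest = new Map();
    for (const doc of docs) {
        const seen = newest.get(doc.transaction_reference);
        // Same-format ISO 8601 UTC strings compare correctly as strings.
        if (!seen || doc.object_creation_date > seen.object_creation_date) {
            newest.set(doc.transaction_reference, doc);
        }
    }
    return [...newest.values()];
}
```

With the two sample documents from the question, `latestPerReference` should return only the PAID version of SI-1. Each result of the pipeline carries the full document under `latest`, so fields such as `transaction_status` or `transaction_gross_value` are available without listing them one by one.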

Answered 2016-07-29T05:06:08.697