elasticsearch - Elasticsearch aggregation on top_hits over a different field

Question

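I'm trying to run an aggregation over a filtered set of documents; however, the filter itself requires an aggregation (the latest "test" for each "applicant"). The top-level aggregation would be on one field of the documents, but on a different field from the one used for the filtering aggregation.

For example (I'm building on another user's question here: Query or filter for minimum field value?).

Given the following set of documents: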

{ "test": 1, "applicant": 1, "score": 90, "topic": "geometry" },
{ "test": 2, "applicant": 2, "score": 65, "topic": "physics" },
{ "test": 3, "applicant": 2, "score": 88, "topic": "geometry" },
{ "test": 4, "applicant": 1, "score": 23, "topic": "english" },
{ "test": 5, "applicant": 3, "score": 50, "topic": "physics" },
{ "test": 6, "applicant": 3, "score": 77, "topic": "english" }

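We want to know, for each topic, how many users achieved their highest score in it.

In other words, we want to:

1. Keep only the highest-scoring test for each user.
2. Group (and count) the results by topic.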

So, for step 1, we should keep only:

{ "test": 1, "applicant": 1, "score": 90, "topic": "geometry" },
{ "test": 3, "applicant": 2, "score": 88, "topic": "geometry" },
{ "test": 6, "applicant": 3, "score": 77, "topic": "english" }

For step 2, group them by topic:

{ "topic": "geometry", "count": 2 }
{ "topic": "english",  "count": 1 }
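
As a reference for the intended result, the two steps can be sketched in plain Python over the sample documents, outside Elasticsearch. This is only an illustration of the logic, not a query; the function name `count_top_tests_by_topic` is hypothetical.

```python
from collections import Counter

# Sample documents from the question.
DOCS = [
    {"test": 1, "applicant": 1, "score": 90, "topic": "geometry"},
    {"test": 2, "applicant": 2, "score": 65, "topic": "physics"},
    {"test": 3, "applicant": 2, "score": 88, "topic": "geometry"},
    {"test": 4, "applicant": 1, "score": 23, "topic": "english"},
    {"test": 5, "applicant": 3, "score": 50, "topic": "physics"},
    {"test": 6, "applicant": 3, "score": 77, "topic": "english"},
]


def count_top_tests_by_topic(docs):
    """Hypothetical helper: keep each applicant's best test, then count per topic."""
    best = {}  # applicant -> highest-scoring document seen so far
    for doc in docs:
        current = best.get(doc["applicant"])
        # Strict comparison: on a tie, the first document encountered is kept.
        if current is None or doc["score"] > current["score"]:
            best[doc["applicant"]] = doc
    # Step 2: group the surviving documents by topic and count them.
    return Counter(doc["topic"] for doc in best.values())


if __name__ == "__main__":
    for topic, count in count_top_tests_by_topic(DOCS).items():
        print({"topic": topic, "count": count})
```

The key point is the order of operations: the per-applicant maximum is taken across all topics first, and only then are the surviving documents grouped by topic.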

The problem is that if I use an aggregation with top_hits to do the filtering:

{
  "aggs": {
    "applicants": {
      "terms": {
        "field": "applicant",
        "order": { "highest_score": "desc" }
      },
      "aggs": {
        "highest_score": { "max": { "field": "score" } },
        "highest_score_top_hits": {
          "top_hits": {
            "size": 1,
            "sort": [{ "score": { "order": "desc" } }]
          }
        }
      }
    }
  }
}

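This gets my first step right (top_hits), but if I add a "parent" aggregation on "topic", the top_hits aggregation no longer works properly: each applicant's tests are split across different "topic" buckets, so the highest score is computed per topic rather than per applicant and is therefore incorrect.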
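
To make the failure concrete, here is a plain-Python sketch that mimics what happens when the per-applicant top hit is computed inside each topic bucket. It is a simulation of the nesting, not an Elasticsearch query, and the function name `count_top_tests_within_topic_buckets` is hypothetical.

```python
from collections import defaultdict

# Sample documents from the question.
DOCS = [
    {"test": 1, "applicant": 1, "score": 90, "topic": "geometry"},
    {"test": 2, "applicant": 2, "score": 65, "topic": "physics"},
    {"test": 3, "applicant": 2, "score": 88, "topic": "geometry"},
    {"test": 4, "applicant": 1, "score": 23, "topic": "english"},
    {"test": 5, "applicant": 3, "score": 50, "topic": "physics"},
    {"test": 6, "applicant": 3, "score": 77, "topic": "english"},
]


def count_top_tests_within_topic_buckets(docs):
    """Hypothetical helper: mimic a topic -> applicant -> top hit nesting."""
    best = {}  # (topic, applicant) -> highest-scoring document in that bucket
    for doc in docs:
        key = (doc["topic"], doc["applicant"])
        if key not in best or doc["score"] > best[key]["score"]:
            best[key] = doc
    # Count the surviving top hits per topic.
    counts = defaultdict(int)
    for topic, _applicant in best:
        counts[topic] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_top_tests_within_topic_buckets(DOCS))
```

Because the maximum is taken per (topic, applicant) pair, a test that is an applicant's best within one topic is counted even when that applicant scored higher in another topic. With the sample data every applicant has at most one test per topic, so no test is filtered out at all.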

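It seems the best approach would be to apply a query filter before the "topic" aggregation, but I can't work out how to write a filter that keeps only the highest-scoring test for each applicant.

Any ideas?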
