
My problem: the Elasticsearch count differs from the count in my database.

I indexed the "users" table; each user can have one or more apps_events:

curl localhost:9200/users/_count
{"count":190291,"_shards":{"total":5,"successful":5,"failed":0}}

SELECT COUNT(*) FROM users
count : 190291

=> Same count, everything is fine!

But when I run a search with 2 filters, a terms filter and a term filter, both on the nested resource:

curl -X GET 'http://localhost:9200/users/user/_search?load=&size=10&pretty' -d '
{
"query": {
  "match_all": {
  }
},
"filter": {
  "and": [
    {
      "terms": {
        "apps_events.type": [
          "sale"
        ]
      }
    },
    {
      "term": {
        "apps_events.status": "active"
      }
    }
  ]
},
"size": 10
}'

total : 63756

In my database:

SELECT
  COUNT(DISTINCT(users_id))
FROM
  apps_event
WHERE
  apps_event_state_id = 1 AND apps_event_project_id = 2;

count : 63340

Because the SQL query that the Elasticsearch search is actually equivalent to is:

SELECT
  COUNT(DISTINCT(users_id))
FROM apps_event
WHERE apps_event_state_id = 1
AND users_id IN
  (SELECT DISTINCT(users_id) FROM apps_event WHERE apps_event_project_id = 2)

count : 63756
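
As a rough illustration of why the two counts can differ, here is a toy Python sketch. It assumes apps_events is mapped as a plain object array (the default), in which case Elasticsearch flattens the values of each field across all of a user's events, so each filter can be satisfied by a different event. The data and the helper names (matches_flattened, matches_per_event) are made up for the example.

```python
# Toy data: each user id maps to a list of apps_events (values are made up).
users = {
    1: [{"type": "sale", "status": "active"}],
    2: [{"type": "sale", "status": "inactive"},
        {"type": "signup", "status": "active"}],
    3: [{"type": "signup", "status": "active"}],
}


def matches_flattened(events):
    # With a plain object array, every field is flattened into one bag of
    # values per user, so each filter is checked against ALL events at once.
    types = {e["type"] for e in events}
    statuses = {e["status"] for e in events}
    return "sale" in types and "active" in statuses


def matches_per_event(events):
    # What the first SQL query asks for: one single event satisfies both
    # conditions at the same time.
    return any(e["type"] == "sale" and e["status"] == "active" for e in events)


flattened_hits = sorted(uid for uid, ev in users.items() if matches_flattened(ev))
per_event_hits = sorted(uid for uid, ev in users.items() if matches_per_event(ev))

print(flattened_hits)  # [1, 2]
print(per_event_hits)  # [1]
```

User 2 is counted by the flattened match because one event is a sale and a different event is active, which is the same effect as the IN (SELECT ...) query above.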

===> How can I do a simple "AND" that applies to each resource individually?

Thanks


2 Answers


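You have probably already checked this, but is apps_event_project_id really the counterpart of apps_events.type? On the surface they don't look like the same thing, but you would know for sure. Also, does users_id map directly to the ES _id? It could be that your index contains duplicates, which would inflate its count.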

answered 2013-02-18T16:17:19.677

The best resource on "nested resources": http://www.spacevatican.org/2012/6/3/fun-with-elasticsearch-s-children-and-nested-documents/
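
For reference, a minimal sketch of what a nested version of the search body from the question might look like, built in Python. It assumes apps_events is declared with "type": "nested" in the mapping (which means reindexing if it currently is a plain object), and it keeps the top-level "filter" and "and" filter syntax used in the question, which newer Elasticsearch versions no longer accept. The function name build_nested_search is just for the example.

```python
import json


def build_nested_search(event_type, status, size=10):
    # Both conditions sit inside one "nested" filter, so they must hold
    # for the SAME apps_events entry instead of any entry of the user.
    nested_filter = {
        "nested": {
            "path": "apps_events",
            "filter": {
                "and": [
                    {"terms": {"apps_events.type": [event_type]}},
                    {"term": {"apps_events.status": status}},
                ]
            },
        }
    }
    return {"query": {"match_all": {}}, "filter": nested_filter, "size": size}


body = build_nested_search("sale", "active")
# Print the JSON that would be passed to curl with -d.
print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the same users/user/_search endpoint as in the question; the total should then be comparable with the first SQL count, provided the field mapping matches the SQL columns.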

answered 2013-02-18T17:24:25.853