How do I fetch the pages of a paginated query from a cached copy of the query?

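When searching, we get at most 10 results by default. We can also specify "size" and "from".

However (taking a trivial query just to keep things simple), I would like to know what happens if I paginate like this: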

curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 10
}'

curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "match_all": {}
  },
  "from": 10,
  "size": 10
}'

curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "match_all": {}
  },
  "from": 20,
  "size": 10
}'

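Is the query executed on the server every time, with one "page" returned? Or is the query executed and cached only on the first request?

I can see a use for both behaviors:

  1. If the query is re-executed every time, the results reflect any document changes that happened in the meantime.
  2. If it is cached, it puts less load on the server. Specifically, this could be used to "stream" results from the server to some "reducer" on the client. (In that case, I would like the query to return a link to the next page.)

How do I get each of these behaviors, and which one is the default?

Also, what happens if my query sorts with a script? For example: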

curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "script": "Math.random()",
      "type": "number",
      "order": "asc"
    }
  },
  "from": 0,
  "size": 10
}'

curl -XPOST 'http://localhost:9200/index1/type1/_search' -d '{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "script": "Math.random()",
      "type": "number",
      "order": "asc"
    }
  },
  "from": 10,
  "size": 10
}'

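Is the random sort applied twice (so that some items might appear in both result pages)? How can I prevent that and "lock" the query to the pagination?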

1 Answer

This is a two-year-old question with no answers. I'm answering because I hate coming across unanswered questions, and I'm doing my bit.

One feature that Elasticsearch offers is the Scroll API (available as far back as v0.9 and still available in 1.5 with few changes).

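This feature lets you keep a cached result set for a query on the server. The keep-alive is set with the "scroll" parameter (1 minute, written "1m", is the value used in the documentation examples), and each follow-up scroll request renews it. If no follow-up request arrives within that window, the cached context expires and the query has to be sent to the shards again, which returns a newer version of the results.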

This is very handy when you have a lot of live, changing data. It is particularly useful when you are moving data between indexes during a migration or a mapping update.
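To make the scroll flow concrete, here is a minimal Python sketch of it. It is only an illustration under some assumptions: the helper names (scroll_hits, http_send) and the injectable "send" transport are hypothetical and not part of Elasticsearch, and the follow-up request sends the scroll id as the raw request body, which is the 1.x style; newer releases expect a JSON body containing "scroll" and "scroll_id" instead.

```python
import json
import urllib.request
from typing import Callable, Iterator

# Hypothetical transport signature: (method, url, body_text) -> parsed JSON response.
Transport = Callable[[str, str, str], dict]


def http_send(method: str, url: str, body: str) -> dict:
    """Plain-stdlib transport; swap in whatever HTTP client you prefer."""
    request = urllib.request.Request(url, data=body.encode("utf-8"), method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def scroll_hits(send: Transport, base_url: str, search_path: str, query: dict,
                keep_alive: str = "1m", size: int = 10) -> Iterator[dict]:
    """Yield every hit of `query`, page by page, from one scroll context."""
    # Initial search: the `scroll` parameter asks the server to keep the
    # search context (a snapshot of the results) alive for `keep_alive`.
    body = json.dumps({"query": query, "size": size})
    page = send("POST", f"{base_url}/{search_path}/_search?scroll={keep_alive}", body)

    # An empty page of hits means the result set is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Follow-up request (assumed 1.x style): pass back the latest scroll id
        # as the raw body; this also renews the keep-alive.
        page = send("POST", f"{base_url}/_search/scroll?scroll={keep_alive}",
                    page["_scroll_id"])
```

Against the index from the question, usage would look something like `for hit in scroll_hits(http_send, "http://localhost:9200", "index1/type1", {"match_all": {}}): ...`. Because every page comes from the same search context, the pages do not shift underneath you the way repeated "from"/"size" requests can.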

Answered 2015-05-19T03:07:46.820