elasticsearch - Why can't my index documents be fetched by id after restoring a snapshot from ES 6.4.2 into ES 7.6.1?

Question

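After upgrading my ES cluster from 6.4.2 to 7.6.1 and restoring a snapshot taken on the old cluster, some documents in one particular index can no longer be fetched by id.

After the restore, this request no longer finds the document: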

GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3

If I index a copy of the document under the same id:

PUT myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3
{
   "name" : "dogs and cats",
   "notes" : "Imported",
   "myid" : "c1d89b00-d030-11e3-bd52-f3718ac695f3" // yes, it's redundant
}

then this suddenly works:

GET myindex/_doc/c1d89b00-d030-11e3-bd52-f3718ac695f3

However, I now have two documents with the same ID.

(An update doesn't work, because the original document can't be fetched by ID.)
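One way to look at the two same-ID documents is to search by `_id` instead of fetching by id, and check which shard each hit comes from. The following is only a sketch: the helper names `build_ids_search` and `summarize_hits` are made up for illustration, and it relies on `"explain": true` adding a `_shard` field to each hit. It builds the body for `GET myindex/_search` and does not contact a cluster.

```python
import json


def build_ids_search(doc_id: str) -> dict:
    """Body for `GET myindex/_search` that matches documents by _id."""
    return {
        "query": {"ids": {"values": [doc_id]}},
        # With explain enabled, each hit also reports the shard it came from.
        "explain": True,
    }


def summarize_hits(response: dict) -> list:
    """Return (_id, _shard, _routing) for every hit in a search response.

    _routing is only present on hits that were indexed with a routing value.
    """
    return [
        (hit.get("_id"), hit.get("_shard"), hit.get("_routing"))
        for hit in response.get("hits", {}).get("hits", [])
    ]


# Print the request body so it can be pasted into the console.
print(json.dumps(build_ids_search("c1d89b00-d030-11e3-bd52-f3718ac695f3"), indent=2))
```

If the two hits report different shards, or one of them carries a `_routing` value, that would suggest the old document sits on a shard that a plain GET by id does not look at.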

Index definition:

GET myindex
{
  "myindex" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "merge_id" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "notes" : {
          "type" : "text",
          "analyzer" : "index_ngram",
          "search_analyzer" : "search_ngram"
        },
        "myid" : {
          "type" : "keyword"
        }
      }
    },
    "settings" : {
      "index" : {
        "max_ngram_diff" : "48",
        "number_of_shards" : "5",
        "provided_name" : "myindex",
        "creation_date" : "1584420860612",
        "analysis" : {
          "filter" : {
            "my_ngram" : {
              "type" : "ngram",
              "min_gram" : "2",
              "max_gram" : "50"
            }
          },
          "analyzer" : {
            "index_ngram" : {
              "filter" : [
                "lowercase",
                "my_ngram"
              ],
              "type" : "custom",
              "tokenizer" : "keyword"
            },
            "default" : {
              "tokenizer" : "keyword"
            },
            "search_ngram" : {
              "filter" : "lowercase",
              "type" : "custom",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "uyp_WK3xRjucFRGhYDHbcQ",
        "version" : {
          "created" : "7060199"
        }
      }
    }
  }
}

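The most interesting part is that I have other indices (using a different id format) whose data was restored from the same snapshot, and their documents can still be fetched by id after the upgrade.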

score 0 · Accepted Answer

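Somehow, the problem of documents not being fetchable by id after restoring the old cluster's snapshot appears to be related to the number of shards used by that index.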
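To illustrate why the shard count can matter for a lookup by id: Elasticsearch picks the shard for a document with, in simplified form, `shard_num = hash(_routing) % num_primary_shards`, where `_routing` defaults to the document id. The sketch below uses `zlib.crc32` purely as a stand-in hash (Elasticsearch uses its own Murmur3-based hash, so the numbers will not match a real cluster), and `some-other-routing-value` is a hypothetical routing value. Whether a routing mismatch is what actually happened in this case is a guess, not something confirmed here.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Simplified routing rule: hash(_routing) % num_primary_shards.

    zlib.crc32 is an assumption standing in for Elasticsearch's real hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


doc_id = "c1d89b00-d030-11e3-bd52-f3718ac695f3"

# With 5 shards the target shard depends on the routing value;
# with 1 shard every routing value maps to shard 0.
for routing in (doc_id, "some-other-routing-value"):
    print(routing, "| 5 shards:", shard_for(routing, 5), "| 1 shard:", shard_for(routing, 1))
```

With a single primary shard the modulo is always 0, so a document is found no matter which routing value it was originally indexed with.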

Shrinking the index into a new index with a single shard, as shown below, resolves the problem:

PUT /myindex/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "instance-0000000000", 
    "index.blocks.write": true 
  }
}

POST myindex/_shrink/myindex_shrinked
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 1, 
    "index.codec": "best_compression" 
  },
  "aliases": {
    "my_search_indices": {}
  }
}

PUT /myindex_shrinked/_settings
{
  "settings": {
    "index.routing.allocation.require._name": null, 
    "index.blocks.write": true 
  }
}
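The shrink API only accepts a target shard count that is a factor of the source index's shard count. A small sketch (the helper name `valid_shrink_targets` is made up for illustration) lists the possible targets below the source count:

```python
def valid_shrink_targets(source_shards: int) -> list:
    """Shard counts smaller than the source that divide it evenly."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(5))  # → [1]
print(valid_shrink_targets(8))  # → [1, 2, 4]
```

For the 5-shard `myindex`, a single shard is therefore the only shrink target. Note also that the last request above, as written, leaves `index.blocks.write` set to `true`, so the shrunken index stays read-only; setting it to `null` would lift the block if writes are needed.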
