python - Elasticsearch scroll ends without returning anything

Question

I am using the Elasticsearch 6.1 API for Python and trying to read a value from every document in the database (303,958 documents).

doc = {
    'size' : 1000,
    'query' : {
        'match_all' : {}
    }
}

samplesCount = 0

res = es.search(index="index", doc_type='data', body=doc, scroll='1m')
scrollId = res['_scroll_id']

scrollSize = res['hits']['total']

while scrollSize > 0 :
    for x in range (0, len(res['hits']['hits']) - 1) :
        name = res['hits']['hits'][x]['_source']['name']
        samplesCount += 1
        print(str(samplesCount) + '. ' + name)
        scrollSize -= 1

    res = es.scroll(scroll_id=scrollId, scroll='1m')

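The counter (samplesCount) stops at 303,654, and it seems es.scroll returns no results for the remaining documents (about 300, fewer than the scroll size).

What puzzles me is that it stops at 303,654. I would have expected a round number (a multiple of 1000).

Any ideas?

Thanks a lot for any helpful hints.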

score 1 · Accepted Answer

Try replacing

range (0, len(res['hits']['hits']) - 1)

with

range(0, len(res['hits']['hits']))

or (equivalently)

range(len(res['hits']['hits']))

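Judging from the syntax and the numbers you quote, it looks like you are skipping 1 record on every iteration of the while loop. The stop value of range is exclusive, so subtracting 1 from the length leaves out the last hit of each page.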
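The numbers in the question can be checked without a cluster. The sketch below is only a simulation: `count_processed` is a hypothetical helper, not part of the Elasticsearch client, and it assumes every page except the last holds exactly 1000 hits. With 303,958 documents that gives 304 pages, so dropping one hit per page should lose 304 documents.

```python
def count_processed(total_docs, page_size, skip_last):
    """Simulate paging through total_docs hits, page_size at a time.

    skip_last=True mimics range(0, len(hits) - 1) from the question;
    skip_last=False mimics range(len(hits)).
    """
    processed = 0
    remaining = total_docs
    while remaining > 0:
        # Number of hits this page would contain.
        page_len = min(page_size, remaining)
        stop = page_len - 1 if skip_last else page_len
        for _ in range(0, stop):
            processed += 1
        remaining -= page_len
    return processed


buggy = count_processed(303958, 1000, skip_last=True)
fixed = count_processed(303958, 1000, skip_last=False)
print(buggy, fixed)  # 303654 303958
```

This also suggests why the loop in the question never finishes cleanly: scrollSize is decremented only for processed hits, so it is still 304 when the scroll is exhausted, and later es.scroll calls return empty pages. Stopping when res['hits']['hits'] is empty is probably a more robust exit condition than counting down from the total.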
