python - Elasticsearch - Reindexing a single field with a different analyzer using Python

Question

I load my JSON file into Elasticsearch using dynamic mapping, like this:

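import json

from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def extract():
    # Load the whole movie dump into a dict keyed by document id
    with open('tmdb.json') as f:
        return json.load(f)

movieDict = extract()

def index(movieDict={}):
    # One index request per document; the mapping is created dynamically
    for id, body in movieDict.items():
        es.index(index='tmdb', id=id, doc_type='movie', body=body)

index(movieDict)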

How can I update the mapping for a single field? I have a title field to which I want to add a different analyzer.

title_settings = {"properties" : { "title": {"type" : "text", "analyzer": "english"}}}
es.indices.put_mapping(index='tmdb', body=title_settings)

This fails.

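I know that I cannot update an index that already exists, but what is the correct way to reindex with a mapping generated from the JSON file? My file has many fields, and creating the mapping/settings by hand would be tedious.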

I can specify an analyzer for a query, like this:

query = {"query": {
            "multi_match": {
                "query": userSearch, "analyzer":"english", "fields": ['title^10', 'overview']}}}

How do I specify it for an index or for a field?

I can also put the analyzer into the index settings by closing the index, updating the settings, and reopening it:

analysis = {'settings': {'analysis': {'analyzer': 'english'}}}
es.indices.close(index='tmdb')
es.indices.put_settings(index='tmdb', body=analysis)
es.indices.open(index='tmdb')

Copying the exact settings of the english analyzer does not "activate" it for my data.

https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis-lang-analyzer.html#english-analyzer

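By "activate" I mean that searches do not return results in the form the english analyzer would produce, i.e. the stop words are still there.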

score 0 · Accepted Answer

Solved it after a lot of googling....

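You cannot change the analyzer for data that has already been indexed. Closing and reopening the index does not change this. Instead, you can define a new index, create a new mapping, and load the data again (the fastest way).
Specifying an analyzer for the whole index is not a good solution, because the "english" analyzer is specific to "text" fields. It is better to specify the analyzer per field.
If you specify an analyzer per field, you also need to specify the field's type.
Keep in mind that analyzers can be used at index time and/or at search time. Reference: "Specify an analyzer".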
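The last two points can be illustrated with a small mapping fragment. This is only a minimal sketch, assuming the typeless mapping format of Elasticsearch 7.x; the helper name text_field and the variable new_index_body are made up for illustration. It relies on the analyzer and search_analyzer mapping parameters of text fields.

```python
def text_field(analyzer, search_analyzer=None):
    """Build a mapping fragment for a text field (hypothetical helper).

    `analyzer` is used at index time, and also at search time unless
    `search_analyzer` is given to override it.
    """
    # The type has to be stated together with the analyzer
    field = {"type": "text", "analyzer": analyzer}
    if search_analyzer is not None:
        field["search_analyzer"] = search_analyzer
    return field


# Body for a new index in which only `title` gets the english analyzer
new_index_body = {
    "mappings": {
        "properties": {
            "title": text_field("english"),
        }
    }
}
```

A dict shaped like new_index_body could then be passed as the body when creating the new index.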

Code:

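import time

def create_index(movieDict={}, mapping={}):
    # Create the new index with the explicit mapping before loading any data
    es.indices.create(index='test_index', body=mapping)

    # Index the documents one by one and report how long it took
    start = time.time()
    for id, body in movieDict.items():
        es.index(index='test_index', id=id, doc_type='movie', body=body)
    print("--- %s seconds ---" % (time.time() - start))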
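The loop above sends one request per document. As an alternative that is not part of the original answer, the Python client's bulk helper can send the documents in batches. The following is a rough sketch under that assumption; the function names bulk_actions and load are hypothetical, and load expects an already connected client.

```python
def bulk_actions(movie_dict, index_name):
    """Yield one bulk action per document (hypothetical helper)."""
    for doc_id, body in movie_dict.items():
        yield {"_index": index_name, "_id": doc_id, "_source": body}


def load(es, movie_dict, index_name="test_index"):
    """Send all documents with elasticsearch.helpers.bulk."""
    # Imported here so bulk_actions can be used without the client installed
    from elasticsearch import helpers

    return helpers.bulk(es, bulk_actions(movie_dict, index_name))
```

With the names from the code above, this would be called as load(es, movieDict) after the index has been created.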

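Now, I got the mapping from the dynamic mapping of my JSON file. I just saved it back to a JSON file so it is easier to work with (edit). That's because I have more than 40 fields to map, and doing it by hand would be tedious.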

mapping = es.indices.get_mapping(index='tmdb')

This is an example of how the title key should be specified to use the english analyzer:

'title': {'type': 'text', 'analyzer': 'english','fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}}
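Instead of editing the saved JSON by hand, the same change can be made in code. This is a minimal sketch, assuming get_mapping returns the typeless Elasticsearch 7.x shape {index_name: {"mappings": {"properties": {...}}}}; the function name mapping_with_analyzer is hypothetical.

```python
import copy


def mapping_with_analyzer(get_mapping_response, source_index, field, analyzer):
    """Return a create-index body with `analyzer` set on one field.

    Assumes the typeless response shape of Elasticsearch 7.x:
    {source_index: {"mappings": {"properties": {...}}}}
    """
    # Work on a copy so the original response stays untouched
    mappings = copy.deepcopy(get_mapping_response[source_index]["mappings"])
    mappings["properties"][field]["analyzer"] = analyzer
    # The create-index body wants "mappings" at the top level,
    # without the source index name wrapping it
    return {"mappings": mappings}
```

For example, mapping_with_analyzer(mapping, 'tmdb', 'title', 'english') would produce a body that keeps every dynamically generated field and only adds the analyzer to title.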
