0

I have an Elasticsearch index with several fields that can contain name information, and I am trying to search it like this:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()
s = Search(using=client, index="names")
query = 'smith'
fields = ['name1', 'name2']

results = s.query("multi_match", query=query, fields=fields, fuzziness='AUTO')

for hit in results.scan():
    print(hit.meta.score)

The result is:

None
None
None
...

However, if I build the request by hand:

results = client.search(index="names",
    body={"size": 100, "query":{
        "multi_match": {
            "query": query, "fields": fields, "fuzziness": 'AUTO'
        }
    }
})

my results are:

{'_index': 'names', '_type': 'Name1', '_id': '1MtYSW4BXryTHXwQ1xBS', '_score': 14.226202, '_source': {...}
{'_index': 'names', '_type': 'Name1', '_id': 'N8tZSW4BXryTHXwQHBfw', '_score': 14.226202, '_source': {...}
{'_index': 'names', '_type': 'Name1', '_id': '8MtZSW4BXryTHXwQeR-i', '_score': 14.226202, '_source': {...}
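
For reference, the raw response from client.search is a plain dict, so the scores can be read straight out of hits.hits. The sketch below uses a hand-written sample shaped like the output above; the sample dict and the extract_scores helper are assumptions for illustration, and the elided '_source' content is omitted rather than guessed.

```python
# Hand-written sample mirroring the output above ('_source' is elided in the
# original output, so it is omitted here rather than invented).
response = {
    "hits": {
        "hits": [
            {"_index": "names", "_id": "1MtYSW4BXryTHXwQ1xBS", "_score": 14.226202},
            {"_index": "names", "_id": "N8tZSW4BXryTHXwQHBfw", "_score": 14.226202},
            {"_index": "names", "_id": "8MtZSW4BXryTHXwQeR-i", "_score": 14.226202},
        ]
    }
}


def extract_scores(response):
    """Return (_id, _score) pairs from a raw search response dict (hypothetical helper)."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


for doc_id, score in extract_scores(response):
    print(doc_id, score)
```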

I would prefer to use elasticsearch-dsl if possible, but I need the score information.

4 Answers

3

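The first version of the code is not equivalent to the second. The first version never executes the query; iterating over scan() goes through the Scroll API (elasticsearch.helpers.scan) instead.

The Search.query() method builds or extends the Search object; it does not send the query to Elasticsearch. The following line is therefore misleading, because what it calls results is still a Search object and not a response: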

results = s.query("multi_match", query=query, fields=fields, fuzziness='AUTO')

It should look like this:

# execute() added at the end
results = s.query("multi_match", query=query, fields=fields, fuzziness='AUTO').execute()
# scan() removed 
for hit in results:
    print(hit.meta.score)
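
To make the build-versus-execute distinction concrete, here is a toy sketch of a lazy query builder. ToySearch and fake_backend are hypothetical names invented for this illustration; this is not the elasticsearch-dsl implementation, only the same general pattern.

```python
class ToySearch:
    """Hypothetical, minimal lazy query builder; not the elasticsearch-dsl class."""

    def __init__(self, backend, body=None):
        self._backend = backend  # callable that "runs" a request body
        self._body = body or {}

    def query(self, name, **params):
        # Return a new builder with the query attached; nothing is sent yet.
        return ToySearch(self._backend, {**self._body, "query": {name: params}})

    def execute(self):
        # Only here is the backend actually called.
        return self._backend(self._body)


calls = []


def fake_backend(body):
    """Stand-in for a cluster: records the request and returns an empty result."""
    calls.append(body)
    return {"hits": {"hits": []}}


s = ToySearch(fake_backend).query("multi_match", query="smith", fields=["name1", "name2"])
print(len(calls))  # 0: building the query did not contact the backend
s.execute()
print(len(calls))  # 1: execute() triggered the request
```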
answered 2019-11-11T21:00:08.890
1

Try this:

from elasticsearch_dsl.query import MultiMatch
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()
s = Search(using=client, index="names")
query = 'smith'
fields = ['name1', 'name2']

# Build the multi_match query object explicitly
query_multi = MultiMatch(query=query, fields=fields, fuzziness='AUTO')

r = s.query(query_multi)
results = r.execute()
for hit in results:
    print(hit.meta.score)
answered 2019-11-08T19:20:18.747
0

Try it like this:

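# execute() sends the request; to_dict() returns the raw response as a plain dict
results = s.query("multi_match", query=query, fields=fields, fuzziness='AUTO').execute().to_dict()
for hit in results["hits"]["hits"]:
    print(hit["_score"])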
answered 2019-11-08T17:31:06.247
0

Try this:

s = s.params(preserve_order=True).sort("_score")

After that, scan() can return the score.

By default, scan sets the sort to ['_doc'], which is why it does not return scores.
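
A simplified sketch of that behaviour, assuming the helper overwrites the sort whenever preserve_order is not set. build_scan_body is a hypothetical stand-in written for this illustration and is not part of elasticsearch-py.

```python
def build_scan_body(query=None, preserve_order=False):
    """Hypothetical stand-in for how a scan helper might prepare the request body."""
    body = dict(query) if query else {}
    if not preserve_order:
        # Sorting by _doc is the cheapest order for scrolling, but it replaces
        # any requested sort, including a sort on _score.
        body["sort"] = "_doc"
    return body


body = {
    "query": {"multi_match": {"query": "smith", "fields": ["name1", "name2"], "fuzziness": "AUTO"}},
    "sort": ["_score"],
}

print(build_scan_body(body)["sort"])                       # _doc
print(build_scan_body(body, preserve_order=True)["sort"])  # ['_score']
```

Note that preserving the order can make scrolling over a large result set more expensive, so it is probably best reserved for cases where the score is actually needed.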

answered 2021-04-14T12:27:48.890