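I'm new to Elasticsearch. My queries are slow when I use should clauses to match multiple search terms, and also when I match nested documents. The first query takes 7-10 seconds, and later queries still take 5-6 seconds even with Elasticsearch caching. A plain match query on non-nested fields, by contrast, is fast and returns in under 100 ms.
I'm running Elasticsearch on an AWS instance with 250 GB of RAM and 500 GB of disk space. I have one index template and 204 indices holding about 107 million documents on a single node, with 2 shards per index, and I've set the heap size to 30 GB.
A document can have more than 50k nested objects, so I raised index.mapping.nested_objects.limit to 500k. Searching these nested objects takes too long, and any OR (should match) operation on fields outside the nested object is slow as well. Is there a way to improve query performance on nested objects? Is something wrong with my configuration? And is there a way to make the first query faster too?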
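To put the layout above in numbers, here is a rough sketch of the shard arithmetic. The figures are the ones quoted in this post; the function name `shard_layout` is just something made up for illustration, and the per-shard document count is an average over top-level documents only (nested objects are stored as additional hidden documents, so the real per-shard count is higher).

```python
# Figures quoted in the question; "about 107 million" is approximate.
INDICES = 204
SHARDS_PER_INDEX = 2        # number_of_shards in the template
REPLICAS = 0                # number_of_replicas in the template
TOTAL_DOCS = 107_000_000


def shard_layout(indices, shards_per_index, replicas, total_docs):
    """Return shard counts and the average top-level docs per primary shard."""
    primaries = indices * shards_per_index
    return {
        "primary_shards": primaries,
        "total_shards": primaries * (1 + replicas),
        "avg_docs_per_primary": total_docs / primaries,
    }


layout = shard_layout(INDICES, SHARDS_PER_INDEX, REPLICAS, TOTAL_DOCS)
print(layout)
```

So the single node holds 408 primary shards, and a search that goes through an alias covering every index has to visit all of them.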
{
"index_patterns": [
"product_*"
],
"template": {
"settings": {
"index.store.type": "mmapfs",
"number_of_shards":2,
"number_of_replicas": 0,
"index": {
"store.preload": [
"*"
],
"mapping.nested_objects.limit": 500000,
"analysis": {
"analyzer": {
"cust_product_name": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"english_stop",
"name_wordforms",
"business_wordforms",
"english_stemmer",
"min_value"
],
"char_filter": [
"html_strip"
]
},
"entity_name": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"english_stop",
"business_wordforms",
"name_wordforms",
"english_stemmer"
],
"char_filter": [
"html_strip"
]
},
"cust_text": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"english_stop",
"name_wordforms",
"english_stemmer",
"min_value"
],
"char_filter": [
"html_strip"
]
}
},
"filter": {
"min_value": {
"type": "length",
"min": 2
},
"english_stop": {
"type": "stop",
"stopwords": "_english_"
},
"business_wordforms": {
"type": "synonym",
"synonyms_path": "<some path>/business_wordforms.txt"
},
"name_wordforms": {
"type": "synonym",
"synonyms_path": "<some path>/name_wordforms.txt"
},
"english_stemmer": {
"type": "stemmer",
"language": "english"
}
}
}
}
},
"mappings": {
"dynamic": "strict",
"properties": {
"product_number": {
"type": "text",
"analyzer": "keyword"
},
"product_name": {
"type": "text",
"analyzer": "cust_product_name"
},
"first_fetch_date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyy-MM||yyyy"
},
"last_fetch_date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyy-MM||yyyy"
},
"review": {
"type": "nested",
"properties": {
"text": {
"type": "text",
"analyzer": "cust_text"
},
"review_date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||yyyy-MM||yyyy"
}
}
}
}
},
"aliases": {
"all_products": {}
}
},
"priority": 200,
"version": 1
}
If I search for any specific term in the review text, the response takes far too long.
{
"_source":{
"excludes":["review"]
},
"size":1,
"track_total_hits":true,
"query":{
"nested":{
"path":"review",
"query":{
"match":{
"review.text":{
"query":"good",
"zero_terms_query":"none"
}
}
}
}
},
"highlight":{
"pre_tags":[
"<b>"
],
"post_tags":[
"</b>"
],
"fields":{
"product_name":{
}
}
}
}
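For anyone who wants to reproduce this, here is a minimal sketch of how the request body above could be built in Python, with the option of adding `"profile": true` so Elasticsearch reports per-shard timings. The helper name `build_review_query` is hypothetical, and the sketch only constructs and prints the JSON body; sending it to a cluster is left out.

```python
import json


def build_review_query(term, profile=False):
    """Return the search body from the question for one review term."""
    body = {
        "_source": {"excludes": ["review"]},
        "size": 1,
        "track_total_hits": True,
        "query": {
            "nested": {
                "path": "review",
                "query": {
                    "match": {
                        "review.text": {
                            "query": term,
                            "zero_terms_query": "none",
                        }
                    }
                },
            }
        },
        "highlight": {
            "pre_tags": ["<b>"],
            "post_tags": ["</b>"],
            "fields": {"product_name": {}},
        },
    }
    if profile:
        # Ask Elasticsearch to include a timing breakdown in the response.
        body["profile"] = True
    return body


body = build_review_query("good", profile=True)
print(json.dumps(body, indent=2))
```

Calling `build_review_query("good")` without `profile=True` yields exactly the body shown above.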
I'm sure I'm missing something obvious.