
I'm trying to get features like nGrams and synonyms working, but I'm not having any luck.

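I'm following this blog post. I've tried adapting the mapping and queries to my data, but it only matches exact terms. I also tried the exact data from the article, taken from this gist, with the same result.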

Here is the mapping:

{
   "mappings": {
      "item": {
         "properties": {
            "productName": {
               "fields": {
                  "partial": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_name",
                     "type":"string"
                  },
                  "partial_back": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_name_back",
                     "type":"string"
                  },
                  "partial_middle": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_middle_name",
                     "type":"string"
                  },
                  "productName": {
                     "type":"string",
                     "analyzer":"full_name"
                  }
               },
               "type":"multi_field"
            },
            "productID": {
               "type":"string",
               "analyzer":"simple"
            },
            "warehouse": {
               "type":"string",
               "analyzer":"simple"
            },
            "vendor": {
               "type":"string",
               "analyzer":"simple"
            },
            "productDescription": {
               "type":"string",
               "analyzer":"full_name"
            },
            "categories": {
               "type":"string",
               "analyzer":"simple"
            },
            "stockLevel": {
               "type":"integer",
               "index":"not_analyzed"
            },
            "cost": {
               "type":"float",
               "index":"not_analyzed"
            }
         }
      },
      "settings": {
         "analysis": {
            "filter": {
               "name_ngrams": {
                  "side":"front",
                  "max_gram":50,
                  "min_gram":2,
                  "type":"edgeNGram"
               },
               "name_ngrams_back": {
                  "side":"back",
                  "max_gram":50,
                  "min_gram":2,
                  "type":"edgeNGram"
               },
               "name_middle_ngrams": {
                  "type":"nGram",
                  "max_gram":50,
                  "min_gram":2
               }
            },
            "analyzer": {
               "full_name": {
                  "filter":[
                     "standard",
                     "lowercase",
                     "asciifolding"
                  ],
                  "type":"custom",
                  "tokenizer":"standard"
               },
               "partial_name": {
                  "filter":[
                     "standard",
                     "lowercase",
                     "asciifolding",
                     "name_ngrams"
                  ],
                  "type":"custom",
                  "tokenizer":"standard"
               },
               "partial_name_back": {
                  "filter":[
                     "standard",
                     "lowercase",
                     "asciifolding",
                     "name_ngrams_back"
                  ],
                  "type":"custom",
                  "tokenizer":"standard"
               },
               "partial_middle_name": {
                  "filter":[
                     "standard",
                     "lowercase",
                     "asciifolding",
                     "name_middle_ngrams"
                  ],
                  "type":"custom",
                  "tokenizer":"standard"
               }
            }
         }
      }
   }
}

And the search query (I removed the filters to try to get more results back):

{
   "size":20,
   "from":0,
   "sort":[
      "_score"
   ],
   "query": {
      "bool": {
         "should":[
            {
               "text": {
                  "productName": {
                     "boost":5,
                     "query":"test query",
                     "type":"phrase"
                  }
               }
            },
            {
               "text": {
                  "productName.partial": {
                     "boost":1,
                     "query":"test query"
                  }
               }
            },
            {
               "text": {
                  "productName.partial_middle": {
                     "boost":1,
                     "query":"test query"
                  }
               }
            },
            {
               "text": {
                  "productName.partial_back": {
                     "boost":1,
                     "query":"test query"
                  }
               }
            }
         ]
      }
   }
}

With the query above, if I remove the following clause (the first one in the bool query's should list)

"text":{
    "productName":{
        "boost":5,
        "query":"test query",
        "type":"phrase"
    }
} 

so that it no longer returns exact matches, I get no results at all, no matter what my search term is.

I think I'm missing something really obvious, and I don't know what other information is relevant, so please go easy on me.


1 Answer


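It looks like I found the answer to my own question: blind copying and pasting. The blog post I linked to appears to be out of date, and the JSON in its commands no longer works properly (yet no error was thrown when the commands were sent).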

Here is the code I used to create the index:

{
   "settings": {
      "analysis": {
         "filter": {
            "name_ngrams": {
               "side":"front",
               "max_gram":50,
               "min_gram":2,
               "type":"edgeNGram"
            },
            "name_ngrams_back": {
               "side":"back",
               "max_gram":50,
               "min_gram":2,
               "type":"edgeNGram"
            },
            "name_middle_ngrams": {
               "type":"nGram",
               "max_gram":50,
               "min_gram":2
            }
         },
         "analyzer": {
            "full_name": {
               "filter":[
                  "standard",
                  "lowercase",
                  "asciifolding"
               ],
               "type":"custom",
               "tokenizer":"standard"
            },
            "partial_name": {
               "filter":[
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_ngrams"
               ],
               "type":"custom",
               "tokenizer":"standard"
            },
            "partial_name_back": {
               "filter":[
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_ngrams_back"
               ],
               "type":"custom",
               "tokenizer":"standard"
            },
            "partial_middle_name": {
               "filter":[
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_middle_ngrams"
               ],
               "type":"custom",
               "tokenizer":"standard"
            }
         }
      }
   },
   "mappings" : {
      "product": {
         "properties": {
            "productName": {
               "fields": {
                  "partial": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_name",
                     "type":"string"
                  },
                  "partial_back": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_name_back",
                     "type":"string"
                  },
                  "partial_middle": {
                     "search_analyzer":"full_name",
                     "index_analyzer":"partial_middle_name",
                     "type":"string"
                  },
                  "productName": {
                     "type":"string",
                     "analyzer":"full_name"
                  }
               },
               "type":"multi_field"
            },
            "productID": {
               "type":"string",
               "analyzer":"simple"
            },
            "warehouse": {
               "type":"string",
               "analyzer":"simple"
            },
            "vendor": {
               "type":"string",
               "analyzer":"simple"
            },
            "productDescription": {
               "type":"string",
               "analyzer":"full_name"
            },
            "categories": {
               "type":"string",
               "analyzer":"simple"
            },
            "stockLevel": {
               "type":"integer",
               "index":"not_analyzed"
            },
            "cost": {
               "type":"float",
               "index":"not_analyzed"
            }
         }
      }
   }
}

Here is the code I used to insert test records (I used it 3 times, with slightly different data each time):

{
    "productName": "Thingey",
    "productID": "asdfasef9816",
    "warehouse": "usa",
    "vendor": "Cool Things Inc",
    "productDescription": "This is a cool gizmo",
    "categories": "Cool Things",
    "stockLevel": 6,
    "cost": 15.31
}

And finally, the JSON for the search query:

{
   "size":20,
   "from":0,
   "sort":[
      "_score"
   ],
   "query": {
      "bool": {
         "should":[
            {
               "text": {
                  "productName.partial": {
                     "boost":1,
                     "query":"ing"
                  }
               }
            },
            {
               "text": {
                  "productName.partial_middle": {
                     "boost":1,
                     "query":"ing"
                  }
               }
            },
            {
               "text": {
                  "productName.partial_back": {
                     "boost":1,
                     "query":"ing"
                  }
               }
            }
         ]
      }
   }
}
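
To see why a query such as `ing` can match the `Thingey` test record at all, it may help to reproduce roughly what the three n-gram filters emit for that token. The following is a minimal Python sketch, not Elasticsearch's actual implementation: the helper names `edge_ngrams` and `ngrams` are made up for illustration, and it assumes the token has already been lowercased by the analyzer.

```python
def edge_ngrams(token, min_gram=2, max_gram=50, side="front"):
    """Approximate an edgeNGram filter: grams anchored at one end of the token."""
    upper = min(max_gram, len(token))
    if side == "front":
        return [token[:n] for n in range(min_gram, upper + 1)]
    return [token[-n:] for n in range(min_gram, upper + 1)]


def ngrams(token, min_gram=2, max_gram=50):
    """Approximate an nGram filter: every substring within the length bounds."""
    upper = min(max_gram, len(token))
    return [
        token[start:start + n]
        for n in range(min_gram, upper + 1)
        for start in range(len(token) - n + 1)
    ]


token = "thingey"  # "Thingey" after the lowercase filter (assumed)
front = edge_ngrams(token, side="front")   # like name_ngrams
back = edge_ngrams(token, side="back")     # like name_ngrams_back
middle = ngrams(token)                     # like name_middle_ngrams

print("ing" in front)   # False
print("ing" in back)    # False
print("ing" in middle)  # True
```

Under these assumptions only the `productName.partial_middle` clause would match `ing`; the front and back edge n-gram fields contribute nothing for this particular query.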

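The key change I had to make was moving the settings out of the mapping PUT and into the index creation request. I also moved the initial mapping definition there, but it could equally be created with a regular /index/item/_mapping PUT.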
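
One part of that change, getting `settings` out from under `mappings`, can be sketched in Python. In the body from the question, `settings` sits inside `mappings` next to the `item` type; the hypothetical helper `lift_settings` below lifts it to the top level so the whole body has the shape used for index creation. It only rearranges a dictionary and does not talk to Elasticsearch, and the sample body is a trimmed-down stand-in for the real one.

```python
import copy
import json


def lift_settings(body):
    """Return a copy of `body` with a misplaced mappings.settings moved to the top level."""
    fixed = copy.deepcopy(body)
    mappings = fixed.get("mappings", {})
    if "settings" in mappings and "settings" not in fixed:
        fixed["settings"] = mappings.pop("settings")
    return fixed


# Trimmed-down version of the body from the question (most fields omitted).
broken = {
    "mappings": {
        "item": {
            "properties": {
                "productName": {"type": "string", "analyzer": "full_name"}
            }
        },
        "settings": {
            "analysis": {
                "analyzer": {
                    "full_name": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["standard", "lowercase", "asciifolding"],
                    }
                }
            }
        },
    }
}

fixed = lift_settings(broken)
print(json.dumps(fixed, indent=3))
```

The original dictionary is left untouched, so the two shapes can be compared side by side before anything is sent to the server.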

If any ElasticSearch pros want to expand on this for future readers of this question, please do.

Answered 2013-09-08T20:04:53.390