mongodb - ElasticSearch autocomplete returns 0 hits

Question

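I am trying to build an autocomplete feature for our database, which runs on MongoDB. As users type into the search box, we need to offer suggestions that complete their query.

I have a collection of articles gathered from various sources, with the following fields: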

{
    "title" : "Its the title of a random article",
    "cont" : {  "paragraphs" : [ .... ]  },
    and so on..
}

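I went through Clinton Gormley's video. From 37:00 to 42:00, Gormley describes an autocomplete built with edgeNGram. I also looked at this question and realized that the two approaches are almost the same, differing only in the mappings.

Based on these, I built nearly identical settings and mappings, then restored the articles collection to make sure it gets indexed by ElasticSearch.

The index schema is as follows: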

POST /title_autocomplete/title
{
    "settings": {

        "analysis": {
            "filter": {
                "autocomplete": {
                    "type": "edgeNGram",
                    "min_gram": 2,
                    "max_gram": 50
                }
            },

            "analyzer": {

                "title" : {
                    "type" : "standard",
                    "stopwords":[]
                },
                "autocomplete": {
                    "type" : "autocomplete",
                     "tokenizer": "standard",
                     "filter": ["lowercase", "autocomplete"]
                }
             }
        }
    },
    "mappings": {
        "title": {
            "type": "multi_field",
            "fields" :  {
                "title" : {
                    "type": "string",
                    "analyzer": "title"
                },
                "autocomplete" : {
                    "type": "string",
                    "index_analyzer": "autocomplete",
                    "search_analyzer" : "title"
                }
            }
       }
    }
}
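
To make the intent of the autocomplete analyzer above concrete, here is a minimal Python sketch of what an edge n-gram filter is meant to produce at index time, and why a partial query such as "Its the titl" should then match. It is only an approximation: splitting on whitespace stands in for the standard tokenizer, and the function names `edge_ngrams`, `index_terms` and `search_terms` are hypothetical helpers, not part of ElasticSearch.

```python
def edge_ngrams(token, min_gram=2, max_gram=50):
    """Return the prefixes of `token` with length between min_gram and max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=2, max_gram=50):
    """Approximate the index-time analyzer: split, lowercase, then edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


def search_terms(text):
    """Approximate the search-time analyzer: split and lowercase, no n-grams."""
    return text.lower().split()


indexed = set(index_terms("Its the title of a random article"))
query = search_terms("Its the titl")

# Every query term should be one of the indexed prefixes.
print([term in indexed for term in query])  # prints [True, True, True]
```

With `min_gram` set to 2, the one-letter token "a" produces no terms, while "title" produces "ti", "tit", "titl" and "title". Applying n-grams only at index time is what lets the unfinished word "titl" match a stored prefix exactly.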

But when I run the search query, I get no hits at all!

GET /title_autocomplete/title/_search
{
    "query": {
        "bool" : {
            "must" : {
                "match" : {
                    "title.autocomplete" : "Its the titl"
                }
            },
            "should" : {
                "match" : {
                    "title" : "Its the titl"
                }
            }
        }
    }
}

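Can anyone explain what is wrong with the mapping, the query, or the settings? I have been reading the ElasticSearch documentation for over 7 days now, but I cannot seem to get any further than full-text search!

ElasticSearch version: 0.90.10
MongoDB version: v2.4.9
Using _river
Ubuntu 12.04 64-bit

Update: I realized that the mapping got messed up after applying the settings above: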

GET /title_autocomplete/_mapping
{
   "title_autocomplete": {
      "title": {
         "properties": {
            "analysis": {
               "properties": {
                  "analyzer": {
                     "properties": {
                        "autocomplete": {
                           "properties": {
                              "filter": {
                                 "type": "string"
                              },
                              "tokenizer": {
                                 "type": "string"
                              },
                              "type": {
                                 "type": "string"
                              }
                           }
                        },
                        "title": {
                           "properties": {
                              "type": {
                                 "type": "string"
                              }
                           }
                        }
                     }
                  },
                  "filter": {
                     "properties": {
                        "autocomplete": {
                           "properties": {
                              "max_gram": {
                                 "type": "long"
                              },
                              "min_gram": {
                                 "type": "long"
                              },
                              "type": {
                                 "type": "string"
                              }
                           }
                        }
                     }
                  }
               }
            },
            "content": { 
                  ... paras and all  ...
            },
            "title": {
               "type": "string"
            },
            "url": {
               "type": "string"
            }
         }
      }
   }
}

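After applying the settings, the analyzers and filters were actually mapped as fields of the document, while the original title field was not affected at all! Is this normal?? I guess this explains why the query matches nothing: there is no title.autocomplete field or title.title field at all.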
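
One plausible reading of the mapping output above is that the settings body was stored as an ordinary document, because a POST to /index/type indexes its body instead of configuring the index. The sketch below, in Python, builds a request that creates the index itself with the settings and mappings in one call. It assumes ElasticSearch 0.90.x conventions and differs from the question in two labeled ways: the analyzer type is changed to "custom", and the multi_field is nested under "properties". The host URL and the helper name `build_create_index_request` are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

# Assumed index body: the question's analysis settings, with the analyzer type
# changed to "custom" and the multi_field nested under "properties".
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete": {"type": "edgeNGram", "min_gram": 2, "max_gram": 50}
            },
            "analyzer": {
                "title": {"type": "standard", "stopwords": []},
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete"],
                },
            },
        }
    },
    "mappings": {
        "title": {
            "properties": {
                "title": {
                    "type": "multi_field",
                    "fields": {
                        "title": {"type": "string", "analyzer": "title"},
                        "autocomplete": {
                            "type": "string",
                            "index_analyzer": "autocomplete",
                            "search_analyzer": "title",
                        },
                    },
                }
            }
        }
    },
}


def build_create_index_request(base_url, index, body):
    """Build (but do not send) a PUT request that creates `index` with `body`."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


request = build_create_index_request(
    "http://localhost:9200", "title_autocomplete", INDEX_BODY
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted to avoid side effects.
```

Because analysis settings are fixed when an index is created, the existing index would have to be deleted first, and the index created this way before the river starts feeding documents into it.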

So how should I proceed now?

score 0 · Accepted Answer

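For anyone facing this problem: it is better to delete the index and start over than to waste time on the _river, as DrTech pointed out in the comments.

This saves time, but it is not a solution. (Hence I am not marking it as the answer.)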

score 0 · Accepted Answer

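The key is to set up the index and the mapping before starting the river.

We had an existing setup with a mongodb river and an index called coresearch, to which we wanted to add autocomplete capability. This is the set of commands we used to delete the existing index and river and start over.

The stack is:

ElasticSearch 1.1.1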
MongoDB 2.4.9
ElasticSearchMapperAttachments v2.0.0
ElasticSearchRiverMongoDb/2.0.0
Ubuntu 12.04.2 LTS

curl -XDELETE "localhost:9200/_river/node"
curl -XDELETE "localhost:9200/coresearch"

curl -XPUT "localhost:9200/coresearch" -d '
{
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"]
                }
            }
        }
    }
}'

curl -XPUT "localhost:9200/coresearch/_mapping/users" -d '
{
    "users": {
        "properties": {
            "firstname": { "type": "string", "search_analyzer": "standard", "index_analyzer": "autocomplete" },
            "lastname": { "type": "string", "search_analyzer": "standard", "index_analyzer": "autocomplete" },
            "username": { "type": "string", "search_analyzer": "standard", "index_analyzer": "autocomplete" },
            "email": { "type": "string", "search_analyzer": "standard", "index_analyzer": "autocomplete" }
        }
    }
}'

curl -XPUT "localhost:9200/_river/node/_meta" -d '
{
    "type": "mongodb",
    "mongodb": {
        "servers": [
            { "host": "127.0.0.1", "port": 27017 }
        ],
        "db": "users",
        "gridfs": false,
        "options": {
            "exclude_fields": ["time"],
            "import_all_collections": true
        }
    },
    "index": {
        "name": "coresearch",
        "type": "documents"
    }
}'
