amazon-web-services - Elasticsearch phonetic analyzer returns zero results?

Question

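I am getting 0 results when using the ES phonetic analyzer.

I am using the built-in plugin on AWS: https://aws.amazon.com/about-aws/whats-new/2016/12/amazon-elasticsearch-service-now-supports-phonetic-analysis/

Before indexing, I used this code to set up the phonetic analyzer.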

PUT enpoint/courts_2
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_metaphone"
            ]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": true
          }
        }
      }
    }
  }
}

Note: I did not install the plugin separately, because AWS ships it pre-built (see the link above). Now I am querying the endpoint with this code:

{
    "query": {
        "multi_match": {
            "query": "Abhijith",
            "fields": ["content", "title^10"],
            "analyzer": "my_analyzer"
        }
    },
    "size": "1",
    "_source": ["title", "bench", "court"],
    "highlight": {
        "fields": {
            "title": {},
            "content": {}
        }
    }
}

But I get zero results. This is the output:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }
}

I can confirm that I do get hits when I don't use the analyzer.

When I run the following request, it returns the expected output.

GET courts_2/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Abhijith"
}

{
    "tokens": [
        {
            "token": "ABHJ",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0
        }
    ]
}

Index mapping

{
    "courts_2": {
        "mappings": {
            "properties": {
                "author": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "bench": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "citation": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "content": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "court": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "date": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "id_": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "title": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "verdict": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
}

score 1 · Accepted Answer

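It seems you have not specified a mapping for your courts_2 index, so all text fields were indexed with the standard analyzer.

As a result, no phonetic tokens were indexed, and the phonetic tokens produced at query time have nothing to match.

To configure your text fields to use your analyzer, you need a mapping like this: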
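To make the mismatch concrete, here is a toy Python model of an inverted index. It is only a sketch and not how Elasticsearch is implemented. The lookup table stands in for the metaphone filter and contains only the one code shown in the question's _analyze output ("Abhijith" -> "ABHJ"); the uppercase fallback for other words and all function names are made up for illustration.

```python
# Toy model only: real Elasticsearch analysis is far more involved.
# The single phonetic code below is copied from the _analyze output in the question.
PHONETIC_CODES = {"abhijith": "ABHJ"}


def standard_analyze(text):
    """Rough stand-in for the default analyzer: split on whitespace, lowercase."""
    return [word.lower() for word in text.split()]


def phonetic_analyze(text):
    """Rough stand-in for my_analyzer: lowercase, then replace each token by a code.

    Words missing from the lookup table are simply uppercased (a placeholder,
    not real metaphone output).
    """
    return [PHONETIC_CODES.get(tok, tok.upper()) for tok in standard_analyze(text)]


def build_index(docs, analyze):
    """Map every token produced at index time to the ids of the documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for tok in analyze(text):
            index.setdefault(tok, set()).add(doc_id)
    return index


def search(index, query, analyze):
    """Analyze the query and return the ids of documents sharing at least one token."""
    hits = set()
    for tok in analyze(query):
        hits |= index.get(tok, set())
    return hits


docs = {1: "Abhijith v State"}
default_index = build_index(docs, standard_analyze)   # what courts_2 holds today
phonetic_index = build_index(docs, phonetic_analyze)  # what the mapping should produce

mismatched = search(default_index, "Abhijith", phonetic_analyze)  # the question's situation
matched = search(phonetic_index, "Abhijith", phonetic_analyze)    # analyzer set on the field
plain = search(default_index, "Abhijith", standard_analyze)       # query without the analyzer

print(mismatched, matched, plain)
```

The first search looks up "ABHJ" in an index that only contains "abhijith", so it finds nothing. The other two use the same analysis on both sides and find the document, which matches what the question reports.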

PUT enpoint/courts_2
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_metaphone"
            ]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": true
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "my_analyzer"
      },
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
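One caveat: the analyzer of a field that already exists cannot be changed in place, and courts_2 already has dynamically created mappings. So the request above has to target a new index (or courts_2 has to be deleted first), and the documents must be indexed again. The Python sketch below only prints console requests for that follow-up; it sends nothing. The index name courts_3 is hypothetical, and I am assuming the standard _reindex API and the "field" option of _analyze are available on your domain. With the analyzer set in the mapping, the query no longer needs its own "analyzer" entry, because search uses the field's analyzer by default.

```python
import json

OLD_INDEX = "courts_2"
NEW_INDEX = "courts_3"  # hypothetical name; create it with the settings and mappings above


def console_request(method, path, body):
    """Format a request as a method/path line followed by a JSON body."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


# Copy the existing documents into the new index so they are re-analyzed.
reindex = console_request(
    "POST", "_reindex",
    {"source": {"index": OLD_INDEX}, "dest": {"index": NEW_INDEX}},
)

# Ask which tokens the *field* produces; a phonetic code confirms the mapping took effect.
check = console_request(
    "GET", f"{NEW_INDEX}/_analyze",
    {"field": "title", "text": "Abhijith"},
)

# Same query as in the question, without the explicit analyzer override.
search = console_request(
    "GET", f"{NEW_INDEX}/_search",
    {"query": {"multi_match": {"query": "Abhijith", "fields": ["content", "title^10"]}}},
)

print(reindex, check, search, sep="\n\n")
```

If the mapping was applied, the _analyze check on the title field should return the same kind of phonetic token that the question shows for my_analyzer, rather than the lowercased word.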

Here is the documentation on mapping parameters.

Regards.
