lucene - How can I find out how Elasticsearch parsed a query_string?

Question

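Is there a way to find out, through the Elasticsearch API, how a query string query is actually parsed? You can work it out by hand from the Lucene query syntax, but it would be much nicer to see some representation of what the parser actually produced.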

score 5 · Accepted Answer

As javanna mentioned in the comments, there is the _validate API. Here is how it works on my local Elasticsearch (version 1.6):

curl -XGET 'http://localhost:9201/pl/_validate/query?explain&pretty' -d'
{
  "query": {
    "query_string": {
      "query": "a OR (b AND c) OR (d AND NOT(e or f))",
      "default_field": "t"
    }
  }
}
'

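pl is the name of an index on my cluster. Different indices can have different analyzers, which is why query validation is executed in the scope of an index.

The result of the above curl is: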

{
  "valid" : true,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "explanations" : [ {
    "index" : "pl",
    "valid" : true,
    "explanation" : "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@ce2d82f1)"
  } ]
}
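If you would rather script this than read curl output, here is a minimal Python sketch using only the standard library. The helper names validate_query and extract_explanations are my own, not part of any Elasticsearch client. The sketch assumes the endpoint also accepts POST (sending a body with GET, as curl does above, is awkward in some HTTP clients) and that the response has the shape shown above.

```python
import json
import urllib.request


def validate_query(base_url, index, body):
    """Send a query body to the index-scoped _validate endpoint.

    Hypothetical helper: base_url and index are whatever your cluster uses,
    e.g. "http://localhost:9201" and "pl" in the example above.
    """
    url = f"{base_url}/{index}/_validate/query?explain=true"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: POST is accepted as well as GET
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_explanations(result):
    """Map each index name to its explanation string (None if absent)."""
    return {
        item["index"]: item.get("explanation")
        for item in result.get("explanations", [])
    }
```

Calling extract_explanations on the response above would give a dict with the single key "pl" whose value is the "filtered(...)" string.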

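I deliberately wrote one OR in lowercase, and as you can see in the explanation, it is interpreted as a token rather than as an operator.

As for interpreting the explanation: the format is similar to the +/- operators of the query_string query:

( and ) characters start and end a bool query
a + prefix means the clause goes into must
a - prefix means the clause goes into must_not
no prefix means the clause goes into should (with default_operator equal to OR)

So the above is equivalent to the following: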

{
  "bool" : {
    "should" : [
      {
        "term" : { "t" : "a" }
      },
      {
        "bool": {
          "must": [
            {
              "term" : { "t" : "b" }
            },
            {
              "term" : { "t" : "c" }
            }
          ]
        }
      },
      {
        "bool": {
          "must": {
              "term" : { "t" : "d" }
          },
          "must_not": {
            "bool": {
              "should": [
                {
                  "term" : { "t" : "e" }
                },
                {
                  "term" : { "t" : "or" }
                },
                {
                  "term" : { "t" : "f" }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
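The prefix rules above can be sketched as a small recursive parser. This is only an illustration under stated assumptions: it takes the bare expression (with the "filtered(...)->cache(...)" wrapper already stripped by hand), it understands only field:value terms, and it ignores phrases, ranges and boosts. The explanation text is a debugging aid, not a stable format, so treat the function names here as hypothetical helpers rather than anything Elasticsearch provides.

```python
import re

# A token is a parenthesis, a +/- prefix, or a field:value term.
TOKEN = re.compile(r"[()+\-]|[^\s()+\-][^\s()]*")


def parse_clauses(tokens, pos=0):
    """Parse clauses until ')' or end of input; return (bool_query, next_pos)."""
    clauses = {"must": [], "must_not": [], "should": []}
    while pos < len(tokens) and tokens[pos] != ")":
        occur = "should"  # no prefix means should
        if tokens[pos] == "+":
            occur, pos = "must", pos + 1
        elif tokens[pos] == "-":
            occur, pos = "must_not", pos + 1
        if pos >= len(tokens):
            raise ValueError("prefix without a clause")
        if tokens[pos] == "(":
            node, pos = parse_clauses(tokens, pos + 1)
            pos += 1  # skip the closing ')'
        else:
            field, _, value = tokens[pos].partition(":")
            node = {"term": {field: value}}
            pos += 1
        clauses[occur].append(node)
    # Drop empty occurrence lists so the output stays compact.
    return {"bool": {k: v for k, v in clauses.items() if v}}, pos


def explanation_to_bool(expression):
    """Convert e.g. 't:a (+t:b +t:c)' into a nested bool query dict."""
    return parse_clauses(TOKEN.findall(expression))[0]
```

Feeding it "t:a (+t:b +t:c) (+t:d -(t:e t:or t:f))" yields the same structure as the JSON above, except that must and must_not are always emitted as lists, which the bool query also accepts.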

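I use the _validate API heavily to debug complex filtered queries with many conditions. It is especially useful if you want to check how an analyzer tokenized the input (such as a URL) or whether some filter is cached.

There is also an awesome rewrite parameter, which I was unaware of until now. It makes the explanation even more detailed by showing the actual Lucene query that will be executed.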
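To try the rewrite parameter, it should be enough to add rewrite=true to the query string of the request. A minimal sketch of building that URL follows; validate_url is a hypothetical helper of mine, and the host, port and index name are taken from the example above.

```python
from urllib.parse import urlencode


def validate_url(base_url, index, rewrite=False):
    """Build the index-scoped _validate URL, optionally with rewrite=true."""
    params = {"explain": "true", "pretty": "true"}
    if rewrite:
        params["rewrite"] = "true"
    return f"{base_url}/{index}/_validate/query?{urlencode(params)}"


# Example with the values used earlier in this answer.
print(validate_url("http://localhost:9201", "pl", rewrite=True))
```

The resulting URL can be passed to curl in place of the one used earlier, with the same request body.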
