elasticsearch - Elasticsearch nested match_phrase issue

Question

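We are running a match_phrase query against nested objects, where each nested object holds a single string value.

We want to find occurrences of a string phrase.

Assume the following:

1) The mapping is as follows.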

"attr": {
                "type": "nested",
                "properties": {
                    "attr": {
                        "type": "multi_field",
                        "fields": {
                            "attr": { "type": "string", "index": "analyzed", "include_in_all": true, "analyzer": "keyword" },
                            "untouched": { "type": "string", "index": "analyzed", "include_in_all": false, "analyzer": "not_analyzed" }
                        }
                    }
                }
            }

2) The data looks like this.

Object A:

"attr": [
    {
        "attr": "beverage"
    },
    {
        "attr": "apple wine"
    }
]

Object B:

"attr": [
    {
        "attr": "beverage"
    },
    {
        "attr": "apple"
    },
    {
        "attr": "wine"
    }
]

3) Then, with the query

{
    "query": {
        "match": {
            "_all": {
                "query": "apple wine",
                "type": "phrase"
            }
        }
    }
}

We expect only Object A, but unfortunately Object B is returned as well.

Looking forward to your suggestions.

score 0 · Accepted Answer

You also need to tell the query to look for all of the terms within a single nested document:

"query": {
  "nested": {
    "path": "attr",
    "query": {
      "match": {
        "attr": {
          "query": "apple wine",
          "operator": "and"
        }
      }
    }
  }
}
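
To illustrate why scoping the search to one nested document separates Object A from Object B, here is a minimal Python sketch. It is a toy model, not Elasticsearch code: the function name `nested_and_match` is hypothetical, and whitespace tokenization is an assumption (the real analyzer configured in the mapping may tokenize differently).

```python
def nested_and_match(nested_values, query):
    """Return True if any single nested value contains every query term.

    Toy model of a nested query with operator "and"; whitespace
    tokenization is an assumption, not the mapping's real analyzer.
    """
    terms = query.lower().split()
    for value in nested_values:
        tokens = set(value.lower().split())
        # Every term must be found inside the same nested value.
        if all(term in tokens for term in terms):
            return True
    return False


# The "attr" values of the two objects from the question.
object_a = ["beverage", "apple wine"]
object_b = ["beverage", "apple", "wine"]

print(nested_and_match(object_a, "apple wine"))  # True
print(nested_and_match(object_b, "apple wine"))  # False
```

In this model Object B fails because "apple" and "wine" live in different nested values, so no single nested document contains both terms.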

A good source of information is http://www.spacevatican.org/2012/6/3/fun-with-elasticsearch-s-children-and-nested-documents/

score 0 · Accepted Answer

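In your case, the separate array values need a large gap between their position offsets so that a phrase cannot match across them. There is a configurable gap between instances of the same field, but its default value is 0.

You should change it in the field mapping: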

"attr": {
    "type": "string",
    "index": "analyzed",
    "include_in_all": true,
    "analyzer": "keyword",
    "position_offset_gap": 100
}
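
The effect of the gap can be sketched with a small Python simulation. This is a toy model under stated assumptions, not Lucene's actual implementation: the helper names `index_positions` and `phrase_match` are hypothetical, tokens are split on whitespace, and the gap is simply added to the position counter between consecutive values of the same field.

```python
def index_positions(values, gap):
    """Assign a position to every token of a multi-valued field.

    Assumption: each value after the first starts `gap` extra
    positions after the end of the previous value.
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap
        for token in value.lower().split():
            pos += 1
            positions.append((token, pos))
    return positions


def phrase_match(positions, phrase):
    """Return True if the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms:
        return False
    index = {}
    for token, pos in positions:
        index.setdefault(token, set()).add(pos)
    return any(
        all(start + k in index.get(term, set()) for k, term in enumerate(terms))
        for start in index.get(terms[0], set())
    )


object_a = ["beverage", "apple wine"]
object_b = ["beverage", "apple", "wine"]

for gap in (0, 100):
    a = phrase_match(index_positions(object_a, gap), "apple wine")
    b = phrase_match(index_positions(object_b, gap), "apple wine")
    print(gap, a, b)
# 0 True True
# 100 True False
```

With a gap of 0, "apple" and "wine" from separate values of Object B end up at adjacent positions, so the phrase matches. With a gap of 100 they are far apart, and only Object A, where both words sit in the same value, still matches.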
