elasticsearch - Elasticsearch exact match OR query

Question

I have documents like this in my index:

{
  "field" : "a, b, c, d, e"
}

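The field value is a string produced by an array-to-string function. So not every document has the same string, but every document contains at least the value "a, b".

Now I want a query that matches two kinds of documents:

Documents that have only (exactly) "a, b" as the field value, or documents that contain at least two of the search terms in that field.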

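Basically, my problem is that I can't satisfy the first condition if the field is analyzed, and I can't satisfy the second condition if the field is not analyzed. Is there a solution that avoids cloning the field as not_analyzed?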
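To make the conflict concrete, here is a minimal Python sketch of the two indexing modes. It assumes a simplified stand-in for the analyzer (lowercase, then split on anything that is not an ASCII letter or digit), which is not the real Elasticsearch analysis chain. `analyze`, `index_terms`, and `term_matches` are hypothetical helpers for illustration, not Elasticsearch APIs.

```python
import re

def analyze(value):
    # Simplified stand-in for an analyzer (assumption):
    # lowercase, then split on non-alphanumeric characters.
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

def index_terms(value, analyzed):
    # An analyzed field is indexed as separate tokens;
    # a not_analyzed field is indexed as one term (the whole string).
    return analyze(value) if analyzed else [value]

def term_matches(indexed_terms, term):
    # A term filter matches only if the exact term exists in the index.
    return term in indexed_terms

value = "a, b, c, d, e"
analyzed_terms = index_terms(value, analyzed=True)
raw_terms = index_terms(value, analyzed=False)
```

With these definitions, the term "c" is found only in the analyzed form, while the whole string "a, b" can only be matched as a single term in the not_analyzed form.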

If I clone the field into a not-analyzed field (field1 in the code example), I can use this query. It feels too complex for what it achieves:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "or": [
          {
            "term": {
              "field1": "a, b"
            }
          },
          {
            "and": [
              {
                "term": {
                  "field": "c"
                }
              },
              {
                "term": {
                  "field": "d"
                }
              }
            ]
          }
        ]
      }
    }
  }
}

score 5 · Accepted Answer

You can use a multi-field mapping. This lets a field be sent once but analyzed in two different ways.

"properties": {
  "field": {
    "type": "multi_field",
    "fields": {
      "field": {"type": "string", "index": "analyzed"},
      "raw": {"type": "string", "index": "not_analyzed"}
    }
  }
}

Send the document to Elasticsearch as usual (it will be indexed in two places: field (or field.field) and field.raw).

Your query will now look like this:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "or": [
          {
            "term": {
              "field.raw": "a, b"
            }
          },
          {
            "and": [
              {
                "term": {
                  "field": "c"
                }
              },
              {
                "term": {
                  "field": "d"
                }
              }
            ]
          }
        ]
      }
    }
  }
}
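As a rough illustration of how this filter behaves, the sketch below simulates the two sub-fields in plain Python. The tokenizer is a simplified assumption (not the real analyzer), and `index_doc` and `matches` are hypothetical names used only here.

```python
import re

def index_doc(value):
    # Simulate a multi-field: the same source value is indexed twice,
    # once as tokens ("field") and once as a single term ("field.raw").
    tokens = {t for t in re.split(r"[^0-9a-z]+", value.lower()) if t}
    return {"field": tokens, "field.raw": {value}}

def matches(indexed):
    # Mirrors the filter above: exact "a, b" on field.raw,
    # OR both "c" AND "d" present on the analyzed field.
    exact = "a, b" in indexed["field.raw"]
    both = "c" in indexed["field"] and "d" in indexed["field"]
    return exact or both

docs = ["a, b", "a, b, c", "a, b, c, d, e"]
results = [matches(index_doc(d)) for d in docs]
```

Of the three sample values, the first passes through the exact-match branch and the last through the "c" and "d" branch; "a, b, c" passes neither.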

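This is not the most elegant solution. I would rather change the way you store the data. It seems that "a, b" represents something distinct, so you could put a boolean field "a_b_only" on the document and filter on that.

Good luck, and feel free to ask for more help!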
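One way to read this suggestion is to compute the flag when the document is built, before it is sent to Elasticsearch. A minimal sketch, assuming the source values are available as a list at that point; `build_doc` is a hypothetical helper, and the `a_b_only` field name is taken from the suggestion above.

```python
def build_doc(values):
    # Join the array the same way the question does, and record up front
    # whether the document holds exactly the two base values.
    return {
        "field": ", ".join(values),
        "a_b_only": values == ["a", "b"],
    }

doc_exact = build_doc(["a", "b"])
doc_more = build_doc(["a", "b", "c", "d", "e"])
```

The exact-match branch of the filter could then be a term filter on `a_b_only`, which would remove the need for a not_analyzed copy of the field.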

score 3 · Accepted Answer

Elasticsearch 1.x does not support the multi_field type; instead, use:

"title": {
  "type": "string",
  "fields": {
    "raw": {"type": "string", "index": "not_analyzed"}
  }
}

For more information, read the Elasticsearch 1.7 docs on multi-fields.

score 1 · Accepted Answer

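Out of curiosity, why are you creating a string from your array in the first place? A field in an ES document can hold multiple values, and you can query them with the "terms" filter: http://www.elasticsearch.org/guide/reference/query-dsl/terms-filter/. So instead of your original field data: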

{
  "field1" : "a, b, c, d, e"
}

you would simply store it as an array, like this:

{
  "field1" : ["a", "b", "c", "d", "e"]
}

Then you would query with something like this (note, this is untested!):

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "or": [
          {
            "terms": {
              "field1": ["a", "b"],
              "execution": "and"
            }
          },
          {
            "terms": {
              "field1": ["c", "d"],
              "execution": "and"
            }
          }
        ]
      }
    }
  }
}

One last point: I think your real data would require "field1" to be set to "not_analyzed".
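For reference, the logic this filter expresses can be sketched in plain Python by treating each document's field as a set of values. `all_present` and `matches` are hypothetical helpers; this only models the "execution": "and" semantics and is not Elasticsearch code.

```python
def all_present(values, required):
    # "terms" with execution "and": every listed term must be in the field.
    return set(required) <= set(values)

def matches(values):
    # OR of the two "terms" filters from the query above.
    return all_present(values, ["a", "b"]) or all_present(values, ["c", "d"])

only_ab = matches(["a", "b"])
ab_and_c = matches(["a", "b", "c"])
```

Note that the first clause checks that "a" and "b" are present, not that they are the only values, so in this model ["a", "b", "c"] passes as well. An "only a and b" condition would likely need something extra, such as a flag field on the document.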
