According to http://nest.azurewebsites.net/concepts/writing-queries.html, the && and || operators can be used to combine two queries when using the NEST library to communicate with Elasticsearch.

I have the following query set up:

var ssnQuery = Query<NameOnRecordDTO>.Match(
                q => q.OnField(f => f.SocialSecurityNumber).QueryString(nameOnRecord.SocialSecurityNumber).Fuzziness(0)
            );

and then combine it with a Bool query as follows:

var result = client.Search<NameOnRecordDTO>(
     body => body.Query(
          query => query.Bool(
              bq => bq.Should(
                  q => q.Match(
                     p => p.OnField(f => f.Name.First)
                         .QueryString(nameOnRecord.Name.First).Fuzziness(fuzziness)
                  ),
                  q => q.Match(p => p.OnField(f => f.Name.Last)
                         .QueryString(nameOnRecord.Name.Last).Fuzziness(fuzziness)
                  )
              ).MinimumNumberShouldMatch(2)
          ) || ssnQuery
     )
);

What I think this query says is that if the SocialSecurityNumber matches, or both the Name.First and Name.Last fields match, then the record should be included in the results.

When I execute this query with the following data for the nameOnRecord object used in the calls to QueryString:

"socialSecurityNumber":"123456789",
    "name" : {
      "first":"ryan",          
    }

the results are the person with SSN 123456789, along with anyone whose first name is ryan.

If I remove the || ssnQuery from the query above, I get everyone whose first name is "ryan".

With the || ssnQuery in place and the following data:

{
    "socialSecurityNumber":"123456789",
    "name" : {
      "first":"ryan",
      "last": "smith"
    }        
}

I seem to get the person with SSN 123456789, along with people whose first name is "ryan" or whose last name is "smith".

So adding the || ssnQuery does not appear to have the effect I expected, and I don't know why.

Here is the definition of the index on the object in question:

"nameonrecord" : {
    "properties": {      
        "name": {
            "properties": {
                "name.first": {
                    "type": "string"
                 },
                 "name.last": {
                    "type": "string"
                 }
             }   
        },
        "address" : {
            "properties": {
                "address.address1": {
                    "type": "string",
                     "index_analyzer": "address",
                     "search_analyzer": "address"
                 },
                "address.address2": {
                    "type": "string",
                    "analyzer": "address"
                 },
                 "address.city" : {
                    "type": "string", 
                    "analyzer": "standard"
                 },
                 "address.state" : {
                    "type": "string",
                    "analyzer": "standard"
                 },
                 "address.zip" : {
                    "type" : "string",
                    "analyzer": "standard"
                 }
            }   
        },                
        "otherName": {
           "type": "string"
        },
        "socialSecurityNumber" : {
           "type": "string"   
        },
        "contactInfo" : {
           "properties": {
                "contactInfo.phone": {
                    "type": "string"
                },
                "contactInfo.email": {
                    "type": "string"
                }
            }
        }                
     }   
}

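I don't think the definition of the address analyzer is important, since the address fields are not used in the query, but I can include it if someone wants to see it.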

1 Answer

This was actually a bug in NEST.

First, some background on how NEST helps translate boolean queries:

NEST allows you to use operator overloading to easily create verbose bool queries/filters, i.e.:

term && term will result in:

bool
    must
        term
        term

A naive implementation would rewrite

term && term && term

to

bool
    must
        term
        bool
            must
                term
                term

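As you can imagine, this becomes hard to handle quite quickly as a query gets more complex. NEST can spot these and join them together to become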

bool
    must 
        term
        term
        term

Likewise, term && term && term && !term simply becomes:

bool
    must 
        term
        term
        term
    must_not
        term

Now, if in the previous example you pass in a bool query directly, like so:

bool(must=term, term, term) && !term

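it will still generate the same query. NEST also does the same with should clauses when it sees that the bool descriptors in play consist ONLY of should clauses. This is because the bool query does not quite follow the same boolean logic you would expect from a programming language.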

To summarize the latter:

term || term || term

becomes

bool
    should
        term
        term
        term

term1 && (term2 || term3 || term4) will NOT become

bool
    must 
        term1
    should
        term2
        term3
        term4

This is because as soon as a bool query has a must clause, the should clauses start acting as a boosting factor. So with the previous query you could get back results that contain ONLY term1, which is clearly not what you want in the strict boolean sense of the input.

NEST therefore rewrites this query to

bool 
    must 
        term1
        bool
            should
                term2
                term3
                term4
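
As a rough illustration, the rewritten tree above corresponds to Elasticsearch query DSL along the following lines. The field names and values (`field1`, `value1`, and so on) are placeholders rather than anything from the question, and the exact JSON that NEST serializes may differ in detail.

```json
{
  "query": {
    "bool": {
      "must": [
        { "term": { "field1": "value1" } },
        {
          "bool": {
            "should": [
              { "term": { "field2": "value2" } },
              { "term": { "field3": "value3" } },
              { "term": { "field4": "value4" } }
            ]
          }
        }
      ]
    }
  }
}
```

Because the inner bool has no must clause of its own, at least one of its should clauses has to match, so the OR keeps its strict meaning.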

Now, where the bug came into play is that in your situation you have this:

bool(should=term1, term2, minimum_should_match=2) || term3

NEST identified that both sides of the OR operation contain only should clauses and joined them together, which gives a different meaning to the minimum_should_match parameter of the first bool query.
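
For reference, the query the question intended would look roughly like this in Elasticsearch query DSL, with the two name clauses kept in their own nested bool so that minimum_should_match applies only to them. This is a sketch that assumes the field names from the question's mapping and the second sample record; the fuzziness settings are left out for brevity, and the JSON NEST actually serializes may differ in detail.

```json
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "should": [
              { "match": { "name.first": { "query": "ryan" } } },
              { "match": { "name.last": { "query": "smith" } } }
            ],
            "minimum_should_match": 2
          }
        },
        { "match": { "socialSecurityNumber": { "query": "123456789" } } }
      ]
    }
  }
}
```

Once all three clauses are flattened into a single should list, minimum_should_match can no longer cover just the two name clauses, which is why the merge changed the meaning of the query.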

I just pushed a fix for this, and it will be included in the next release, 0.11.8.0.

Thanks for catching this one!

Answered 2013-11-15T15:15:41.987