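I have documents in Elasticsearch, and I can't figure out how to write a search script that returns a document if any of its attachments has no uuid or has a null uuid. Elasticsearch version 5.2. The document mapping: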

"mappings": {
    "documentType": {
        "properties": {
            "attachment": {
                "properties": {
                    "uuid": {
                        "type": "text"
                    },
                    "path": {
                        "type": "text"
                    },
                    "size": {
                        "type": "long"
                    }
                }
            }
        }
    }
}

In Elasticsearch the documents look like this:

{
        "_index": "documents",
        "_type": "documentType",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "attachment": [
               {
                "uuid": "21321321",
                "path": "../uploads/somepath",
                "size": 1231
               },
               {
                "path": "../uploads/somepath",
                "size": 1231
               }
          ]
        }
},
{
        "_index": "documents",
        "_type": "documentType",
        "_id": "2",
        "_score": 1.0,
        "_source": {
          "attachment": [
               {
                "uuid": "223645641321321",
                "path": "../uploads/somepath",
                "size": 1231
               },
               {
                "uuid": "22341424321321",
                "path": "../uploads/somepath",
                "size": 1231
               }
          ]
        }
},
{
        "_index": "documents",
        "_type": "documentType",
        "_id": "3",
        "_score": 1.0,
        "_source": {
          "attachment": [
               {
                "uuid": "22789789341321321",
                "path": "../uploads/somepath",
                "size": 1231
               },
               {
                "path": "../uploads/somepath",
                "size": 1231
               }
          ]
        }
}

As a result, I want to get the documents with _id 1 and 3. Instead, I get a script error when I try to apply the following script:

{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },
                {
                    "script": {
                        "script": {
                            "inline": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}

The error is as follows:

 "root_cause": [
            {
                "type": "script_exception",
                "reason": "runtime error",
                "script_stack": [
                    "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:77)",
                    "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:36)",
                    "for (item in doc['attachment'].value) { ",
                    "                 ^---- HERE"
                ],
                "script": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
                "lang": "painless"
            }
        ],

Is it possible to select a document if at least one of its attachment objects does not contain a uuid?

2 Answers

Iterating over an array of objects is not as simple as one might expect. I have written about it at length here and here.

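Since your attachment field is not defined as nested, ES will internally represent its sub-fields as flattened lists of values (also known as "doc values"). For example, attachment.uuid in doc#2 will become ["223645641321321", "22341424321321"] and attachment.size will become [1231, 1231].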

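This means you can simply compare the .length of these flattened representations! I'm assuming that attachment.size is always present, so it can serve as the comparison baseline.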
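
To make the flattening idea concrete, here is a minimal Python sketch that mimics it outside of Elasticsearch. It is only a simplified model: the helper names (flatten_field, has_attachment_without_uuid) are hypothetical, and it ignores tokenization and any de-duplication of repeated values that the real per-field storage may perform, so it assumes every uuid is a single, distinct token.

```python
from typing import Any, Dict, List

# The three sample documents from the question, keyed by _id.
DOCS: Dict[str, Dict[str, Any]] = {
    "1": {"attachment": [
        {"uuid": "21321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ]},
    "2": {"attachment": [
        {"uuid": "223645641321321", "path": "../uploads/somepath", "size": 1231},
        {"uuid": "22341424321321", "path": "../uploads/somepath", "size": 1231},
    ]},
    "3": {"attachment": [
        {"uuid": "22789789341321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ]},
}


def flatten_field(source: Dict[str, Any], path: str) -> List[Any]:
    """Collect every non-null value found at a dotted path.

    Arrays of objects are walked transparently, so the link between
    sibling fields of the same object is lost (as with non-nested objects).
    """
    nodes: List[Any] = [source]
    for key in path.split("."):
        next_nodes: List[Any] = []
        for node in nodes:
            items = node if isinstance(node, list) else [node]
            for item in items:
                if isinstance(item, dict) and item.get(key) is not None:
                    next_nodes.append(item[key])
        nodes = next_nodes
    # Unwrap a trailing list so the result is always a flat list of values.
    flat: List[Any] = []
    for node in nodes:
        if isinstance(node, list):
            flat.extend(node)
        else:
            flat.append(node)
    return flat


def has_attachment_without_uuid(source: Dict[str, Any]) -> bool:
    """Same comparison as the script query: fewer uuids than sizes."""
    uuids = flatten_field(source, "attachment.uuid")
    sizes = flatten_field(source, "attachment.size")
    return len(uuids) < len(sizes)


matching_ids = [doc_id for doc_id, src in DOCS.items()
                if has_attachment_without_uuid(src)]
print(matching_ids)
```

With the sample data, only the documents that contain an attachment without a uuid pass the length comparison, which is the behavior the question asks for.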

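One more thing. A text field does not expose its values to scripts by default, so reading attachment.uuid this way requires a small mapping change (enabling fielddata):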

PUT documents/documentType/_mappings
{
  "properties": {
    "attachment": {
      "properties": {
        "uuid": {
          "type": "text",
          "fielddata": true     <---
        },
        "path": {
          "type": "text"
        },
        "size": {
          "type": "long"
        }
      }
    }
  }
}

Once that is done, reindex your documents. This can be done with this little update-by-query trick:

POST documents/_update_by_query

Then you can use the following script query:

POST documents/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "attachment"
          }
        },
        {
          "script": {
            "script": {
              "inline": "def size_field_length = doc['attachment.size'].length; def uuid_field_length =  doc['attachment.uuid'].length; return uuid_field_length < size_field_length",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}
answered 2021-04-14T08:29:47.350
Just to add to this answer: if the mapping for the uuid field was created automatically, Elasticsearch adds it like this:

"uuid": {
    "type": "text",
    "fields": {
        "keyword": {
            "type": "keyword",
            "ignore_above": 256
        }
    }
}

Then the script might look like this:

POST documents/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "exists": {
                        "field": "attachment"
                    }
                },
                {
                    "script": {
                        "script": {
                            "inline": "doc['attachment.size'].length > doc['attachment.uuid.keyword'].length",
                            "lang": "painless"
                        }
                    }
                }
            ]
        }
    }
}
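
The choice between attachment.uuid and attachment.uuid.keyword depends only on how the field is mapped, so it can be made programmatically. The following Python sketch shows one possible way to do that; scriptable_field and build_query are hypothetical helper names, not part of any Elasticsearch client, and the handling of non-text types is an assumption.

```python
from typing import Any, Dict, Optional


def scriptable_field(path: str, field_mapping: Dict[str, Any]) -> Optional[str]:
    """Pick the field name a script can read, given the field's mapping entry."""
    if field_mapping.get("type") != "text":
        # Assumption: other leaf types are readable from scripts as-is.
        return path
    if field_mapping.get("fielddata") is True:
        # Mapping from the first answer: text field with fielddata enabled.
        return path
    keyword = field_mapping.get("fields", {}).get("keyword", {})
    if keyword.get("type") == "keyword":
        # Auto-created mapping: use the keyword sub-field instead.
        return path + ".keyword"
    return None  # plain text field: not readable without a mapping change


def build_query(uuid_field: str,
                baseline_field: str = "attachment.size") -> Dict[str, Any]:
    """Build the same bool/script query body as shown above."""
    script = f"doc['{baseline_field}'].length > doc['{uuid_field}'].length"
    return {
        "query": {
            "bool": {
                "must": [
                    {"exists": {"field": "attachment"}},
                    {"script": {"script": {"inline": script, "lang": "painless"}}},
                ]
            }
        }
    }


auto_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
field = scriptable_field("attachment.uuid", auto_mapping)
query = build_query(field) if field else None
```

The resulting dictionary can be serialized to JSON and sent as the body of POST documents/_search.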
answered 2021-04-15T10:21:58.690