elasticsearch - How to store relational data in Elasticsearch

Question

What are the options for storing relational data in Elasticsearch? I know of the following approaches:

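Nested objects: I don't want to store the data in nested form, because I want to update one document without changing another, and with nested objects the child data would be duplicated in the parent document.
Parent-child: I don't want to store the data in a single index, but parent-child requires both sides to live in one index (as different types). I know this restriction will be removed in a future release, as mentioned in https://github.com/elastic/elasticsearch/issues/15613, but I want a solution that works with version 5.5.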

Are there any other approaches besides these?

score 8 · Accepted Answer

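Nested objects are a perfectly good approach. If you update the child objects correctly, they will not be duplicated in the parent document. I use the same approach in one of my use cases, where I need to maintain relational data with a master-child one-to-many relationship. I wrote a Painless script for the Update API that adds new nested child objects to the parent document and updates existing ones, without creating duplicate entries.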

Updated answer:

Below is the structure of a parent document that embeds its children as nested-type objects in the childs field.

{
    "parent_id": 1,
    "parent_name": "ABC",
    "parent_number": 123,
    "parent_addr": "123 6th St. Melbourne, FL 32904",
    "childs": [
      {
        "child_id": 1,
        "child_name": "PQR",
        "child_number": 456,
        "child_age": 10
      },
      {
        "child_id": 2,
        "child_name": "XYZ",
        "child_number": 789,
        "child_age": 12
      },
      {
        "child_id": 3,
        "child_name": "QWE",
        "child_number": 234,
        "child_age": 16
      }

    ]   
}

The mapping is as follows:

PUT parent
{
  "mappings": {
    "parent": {
      "properties": {
        "parent_id": {
          "type": "long"
        },
        "parent_name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "parent_number": {
          "type": "long"
        },
        "parent_addr": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "childs": {
          "type": "nested",
          "properties": {
            "child_id": {
              "type": "long"
            },
            "child_name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "child_number": {
              "type": "long"
            },
            "child_age": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

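In an RDBMS, these two entities (parent and child) would be two separate tables with a one-to-many relationship from parent to child. The parent's id is the foreign key on the child rows. (Both tables must have an id.)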

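Now in Elasticsearch, to index the parent document we need an id for it, which in this case is parent_id. The request to index the parent document (parent_id is the id I am referring to; this indexes the document with _id = 1):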

POST parent/parent/1
{
    "parent_id": 1,
    "parent_name": "ABC",
    "parent_number": 123,
    "parent_addr": "123 6th St. Melbourne, FL 32904"
}

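Now add a child to the parent. For this you need the child document, which contains the child ID, and the parent ID; a child cannot be added without the parent ID. The following update request adds a new child or updates a child that already exists.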

POST parent/parent/1/_update
{
  "script": {
    "lang": "painless",
    "inline": "if (!ctx._source.containsKey('childs')) { ctx._source.childs = []; ctx._source.childs.add(params.child); } else { boolean found = false; for (int i = 0; i < ctx._source.childs.size(); i++) { if (ctx._source.childs[i].child_id == params.child.child_id) { ctx._source.childs[i] = params.child; found = true; } } if (!found) { ctx._source.childs.add(params.child); } }",
    "params": {
      "child": {
        "child_id": 1,
        "child_name": "PQR",
        "child_number": 456,
        "child_age": 10
      }
    }
  }
}
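
To check the add-or-update logic outside Elasticsearch, here is a minimal Python sketch of what the script above does to the document source. The function name upsert_child is hypothetical and the plain dict stands in for ctx._source; nothing here is an Elasticsearch API.

```python
def upsert_child(source, child):
    """Mimic the Painless update script on a plain dict.

    `source` stands in for ctx._source. `upsert_child` is a hypothetical
    helper for illustration, not part of any Elasticsearch client.
    """
    # Create the nested list on first use, like the containsKey check.
    childs = source.setdefault("childs", [])
    found = False
    for i, existing in enumerate(childs):
        if existing["child_id"] == child["child_id"]:
            childs[i] = child  # replace in place, so no duplicate entry
            found = True
    if not found:
        childs.append(child)  # no matching child_id: add a new child
    return source


parent = {"parent_id": 1, "parent_name": "ABC"}
upsert_child(parent, {"child_id": 1, "child_name": "PQR", "child_age": 10})
upsert_child(parent, {"child_id": 2, "child_name": "XYZ", "child_age": 12})
upsert_child(parent, {"child_id": 1, "child_name": "PQR", "child_age": 11})
print(len(parent["childs"]), parent["childs"][0]["child_age"])  # 2 11
```

Sending the same child_id twice leaves a single entry holding the latest values, which is the behaviour that keeps the nested list free of duplicates.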

Give it a try. Cheers!

Let me know if you need any help.

score 4 · Accepted Answer

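There are two more approaches: denormalization, and application-side joins, where you run multiple queries and join the results yourself.

Denormalization takes more space and increases your write time, but you only need to run one query to retrieve the data, so your read time improves. Since you don't want to store the data in a single index, application-side joins may be the better fit for you.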
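
Both ideas can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names denormalize and children_query are hypothetical, the documents reuse the field names from the other answer, and the child documents are assumed to live in a separate index and carry a parent_id field. The only Elasticsearch feature relied on is the terms query.

```python
def denormalize(parent):
    """Denormalization: emit one standalone document per child, copying
    the parent fields into each (more storage, but single-query reads)."""
    parent_fields = {k: v for k, v in parent.items() if k != "childs"}
    return [{**parent_fields, **child} for child in parent.get("childs", [])]


def children_query(parent_hits):
    """Application-side join, second step: from the hits of a first search
    on the parent index, build the body of a second search on a child index.
    Assumes each child document stores its parent's id in `parent_id`."""
    parent_ids = [hit["_source"]["parent_id"] for hit in parent_hits]
    return {"query": {"terms": {"parent_id": parent_ids}}}


parent = {
    "parent_id": 1,
    "parent_name": "ABC",
    "childs": [
        {"child_id": 1, "child_name": "PQR"},
        {"child_id": 2, "child_name": "XYZ"},
    ],
}
for doc in denormalize(parent):
    print(doc)

hits = [{"_source": {"parent_id": 1}}, {"_source": {"parent_id": 4}}]
print(children_query(hits))
# {'query': {'terms': {'parent_id': [1, 4]}}}
```

With the join variant, parents and children stay in separate indices and can be updated independently, at the cost of a second round trip per read.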
