python - Union of multiple nested JSON files in Python

Question

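I have multiple JSON files containing relational data that I need to merge. Each file has records keyed by a common key that appears in all the files; in the example below, a0 and a1 are the common keys. The values are nested dictionaries with keys such as key1, key2, and so on, as shown below. I need to merge the JSON files and get the output shown in dboutput.json, with the file name used as the index in the merge. There is a related question, but its merge loses information. In my case I don't want any update that replaces an existing key or skips the update: when an existing key is hit, another nested dictionary indexed by file name should be created, as shown below: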

Example:

File db1.json:

"a0": {
        "commonkey": [
            "a1",
            "parentkeyvalue1"
        ],
        "key1": "kvalue1",
        "key2": "kvalue2",
        "keyp": "kvalue2abc"
    },
"a1": {
...
}

File db2.json:

"a0": {
        "commonkey": [
            "a1",
            "parentkeyvalue1"
        ],
        "key1": "kvalue1xyz",
        "key2": "kvalue2",
        "key3": "kvalue2"
    },
"a1": {
...
}

Desired output

File dboutput.json:

"a0": {
        "commonkey": [
            "a1",
            "parentkeyvalue1"
        ],
        "key1": {"db1": "kvalue1", "db2": "kvalue1xyz"},
        "key2": {"db1": "kvalue2", "db2": "kvalue2"},
        "key3": {"db2": "kvalue2"},
        "keyp": {"db1": "kvalue2abc"}
    },
"a1": {
...
}

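So how can I do this kind of lossless merge? Note "key2": {"db1":"kvalue2","db2":"kvalue2"}: even when the key/value pairs are identical, they need to be stored separately. In effect, the output is the union of all input files and contains every entry from every file.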

Also,

"commonkey": [
            "a1",
            "parentkeyvalue1"
        ],

is the same in all files, so it does not need to be repeated.

score 2 · Accepted Answer

I finally managed to get it working:

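import collections
import json
import os


class NestedDict(collections.OrderedDict):
    """Implementation of Perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Missing key: create a nested NestedDict on the fly and return it
            value = self[item] = type(self)()
            return value


def mergejsons(jsns):
    # Use the autovivifying NestedDict so intermediate levels are created on demand
    op = NestedDict()
    for j in jsns:
        with open(j) as f:
            jdata = json.load(f)
        # Index by the file name without directory or extension, e.g. "db1"
        jname = os.path.splitext(os.path.basename(j))[0]
        for commnkey, val in jdata.items():
            for k, v in val.items():
                if k != 'commonkey':
                    # Keep every file's value, even when the values are identical
                    op[commnkey][k][jname] = v
                elif 'commonkey' not in op[commnkey]:
                    # commonkey is the same in all files, so store it only once
                    op[commnkey][k] = v
    return op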
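
For readers unfamiliar with autovivification, here is a minimal sketch of the same idea using the standard library's collections.defaultdict in place of the NestedDict class above. The helper name nested and the sample values are only illustrative assumptions; the point is that assigning three levels deep creates the intermediate dictionaries automatically.

```python
import json
from collections import defaultdict


def nested():
    """Hypothetical helper: a dict whose missing keys become further nested dicts."""
    return defaultdict(nested)


merged = nested()

# No intermediate level exists yet; each lookup creates it on demand.
merged["a0"]["key1"]["db1"] = "kvalue1"
merged["a0"]["key1"]["db2"] = "kvalue1xyz"
merged["a0"]["key3"]["db2"] = "kvalue2"

# defaultdict is a dict subclass, so json can serialize it directly;
# the round trip converts it back to plain dicts.
result = json.loads(json.dumps(merged))
print(json.dumps(result, indent=4, sort_keys=True))
```

One caveat of any autovivifying dictionary is that merely reading a missing key with square brackets creates it, so membership should be checked with the in operator rather than by indexing.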

score 1 · Accepted Answer

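A simple solution is to iterate through each JSON object and add the dictionary pairs under each "commonkey" entry as you encounter it. Here is an example that takes a list of JSON files, loads each one, and merges them iteratively.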

#!/usr/bin/python
import json

# Hardcoded list of JSON files
dbs = [ "db1.json", "db2.json" ]
output = dict() # stores all the merged output

for db in dbs:
    # Name the JSON obj and load it 
    db_name = db.split(".json")[0]
    with open(db) as f:
        obj = json.load(f)

    # Iterate through the common keys, adding them only if they're new
    for common_key, data in obj.items():
        if common_key not in output:
            output[common_key] = dict(commonkey=data["commonkey"])

        # Within each common key, add key, val pairs 
        # subindexed by the database name
        for key, val in data.items():
            if key != "commonkey":
                if key in output[common_key]:
                    output[common_key][key][db_name] = val
                else:
                    output[common_key][key] = {db_name: val}


# Output resulting json to file
with open("dboutput.json", "w") as out:
    json.dump(output, out, sort_keys=True, indent=4, separators=(',', ': '))
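
To check the merge logic without touching the filesystem, the same approach can be sketched as a function over already-parsed data. The function name merge_named and its {db_name: parsed_json} argument are assumptions made for this illustration, not part of the answer above; the sample records are the a0 entries from the question.

```python
import json


def merge_named(named_dbs):
    """Merge {db_name: parsed_json} into one dict, indexing each value by db_name."""
    output = {}
    for db_name, obj in named_dbs.items():
        for common_key, data in obj.items():
            entry = output.setdefault(common_key, {})
            for key, val in data.items():
                if key == "commonkey":
                    # Identical in every file, so keep only the first copy
                    entry.setdefault("commonkey", val)
                else:
                    # Never overwrite: each file gets its own slot under the key
                    entry.setdefault(key, {})[db_name] = val
    return output


db1 = {"a0": {"commonkey": ["a1", "parentkeyvalue1"],
              "key1": "kvalue1", "key2": "kvalue2", "keyp": "kvalue2abc"}}
db2 = {"a0": {"commonkey": ["a1", "parentkeyvalue1"],
              "key1": "kvalue1xyz", "key2": "kvalue2", "key3": "kvalue2"}}

merged = merge_named({"db1": db1, "db2": db2})
print(json.dumps(merged, sort_keys=True, indent=4))
```

Using setdefault avoids the explicit "key already present" branch and, unlike the file-based script, does not fail if a record happens to lack a commonkey entry.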
