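I have a data frame that I want to convert into a hierarchical flare JSON for use in a D3 visualization like this one: D3 sunburst

My data frame contains hierarchical data, such as:

[Image: example data frame]

The output I want should look like this: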

{"name": "flare","children": 
    [
        {"name": "Animal", "children": 
            [
                {"name": "Mammal", "children":
                    [
                        {"name": "Fox","value":35000}, 
                        {"name": "Lion","value":25000}
                    ]
                },
                {"name": "Fish", "children":
                    [
                        {"name": "Cod","value":15000} 
                    ]
                }
            ]
        },
        {"name": "Plant", "children": 
            [
                {"name": "Tree", "children":
                    [
                        {"name": "Oak","value":1500} 
                    ]
                }
            ]
        }
     ]
} 

I have tried several approaches but cannot get it right. Here is my non-working code, inspired by this post: Pandas to D3. Serializing dataframes to JSON

from collections import defaultdict
import pandas as pd
df = pd.DataFrame({'group1':["Animal", "Animal", "Animal", "Plant"],'group2':["Mammal", "Mammal", "Fish", "Tree"], 'group3':["Fox", "Lion", "Cod", "Oak"],'value':[35000,25000,15000,1500]  })
tree = lambda: defaultdict(tree)  
d = tree()
for _, (group0,group1, group2, group3, value) in df.iterrows():
    d['name'][group0]['children'] = group1
    d['name'][group1]['children'] = group2
    d['name'][group2]['children'] = group3
    d['name'][group3]['children'] = value


json.dumps(d)

1 Answer

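I am working on a similar visualization project that requires moving data from a Pandas DataFrame into a JSON file for use with D3.

I came across your post while looking for a solution and ended up writing something based on this GitHub repository, with input from the link you provided in this post.

The code is not pretty, and it is a bit hacky and slow. But judging by my project, it seems to work for any amount of data as long as it has three levels and a value field. You should be able to simply fork the D3 Sunburst notebook and replace the flare.json file with the output of this code.

The modification I made here, relative to the original GitHub post, is to account for three levels of data. If a node with the level 0 name already exists, append from level 1 down. Likewise, if a node with the level 1 name also exists, append only the level 2 node (the third level). Otherwise, append the full path of the data. If you need more levels, some kind of recursion might do the trick, or you can just keep hacking it to add more.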
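For data with more than three levels, one possible generalization is sketched here. It is separate from the three-level script that follows and uses a loop over the level columns rather than explicit recursion. The function name `frame_to_flare` and its parameters are hypothetical names chosen for illustration, and the sketch assumes every row has a non-empty name in each level column.

```python
import json
import pandas as pd


def frame_to_flare(df, level_cols, value_col, root_name="flare"):
    """Build a flare-style nested dict from any number of level columns.

    Hypothetical helper: level_cols lists the columns from the outermost
    level to the leaf level, value_col holds the leaf values.
    """
    root = {"name": root_name, "children": []}
    for row in df[level_cols + [value_col]].itertuples(index=False):
        *path, value = row
        node = root
        # walk down the branch levels, creating nodes that do not exist yet
        for name in path[:-1]:
            match = next((c for c in node["children"] if c["name"] == name), None)
            if match is None:
                match = {"name": name, "children": []}
                node["children"].append(match)
            node = match
        # convert NumPy scalars to plain Python numbers so json can serialize them
        if hasattr(value, "item"):
            value = value.item()
        # the last level becomes a leaf that carries the value
        node["children"].append({"name": path[-1], "value": value})
    return root


df = pd.DataFrame({"group1": ["Animal", "Animal", "Animal", "Plant"],
                   "group2": ["Mammal", "Mammal", "Fish", "Tree"],
                   "group3": ["Fox", "Lion", "Cod", "Oak"],
                   "value": [35000, 25000, 15000, 1500]})

flare = frame_to_flare(df, ["group1", "group2", "group3"], "value")
print(json.dumps(flare, indent=2))
```

Because each row looks up its parent by walking down from the root, the row order does not matter, and adding a fourth level only means passing one more column name in `level_cols`. The linear search through `children` is slow for wide trees; a dict keyed by path would be faster at the cost of a little more bookkeeping.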

# code snippet to format a Pandas DataFrame as JSON for a D3 Sunburst chart

# libraries
import json
import pandas as pd

# example data with three levels and a single value field
data = {'group1': ['Animal', 'Animal', 'Animal', 'Plant'],
        'group2': ['Mammal', 'Mammal', 'Fish', 'Tree'],
        'group3': ['Fox', 'Lion', 'Cod', 'Oak'],
        'value': [35000, 25000, 15000, 1500]}

df = pd.DataFrame.from_dict(data)

print(df)

""" The sample dataframe
group1  group2 group3  value
0  Animal  Mammal    Fox  35000
1  Animal  Mammal   Lion  25000
2  Animal    Fish    Cod  15000
3   Plant    Tree    Oak   1500
"""

# initialize a flare dictionary
flare = {"name": "flare", "children": []}

# iterate through dataframe values
for row in df.values:
    level0 = row[0]
    level1 = row[1]
    level2 = row[2]
    value = row[3]
    
    # names of the first-level nodes already in the tree
    key0 = [node['name'] for node in flare['children']]

    # names of the second-level nodes under the matching first-level node only
    # (collecting them from whichever root was visited last would add duplicate
    # branches when rows of different roots are interleaved)
    key1 = []
    if level0 in key0:
        parent = flare['children'][key0.index(level0)]
        key1 = [child['name'] for child in parent['children']]

    # add the full row of data if the root is not in key0
    if level0 not in key0:
        d = {'name': level0,
              'children': [{'name': level1,
                            'children': [{'name': level2,
                                          'value': value}]}]}
        flare['children'].append(d)

    elif level1 not in key1:
        # the root exists but the second-level node does not:
        # append the second-level node together with its leaf
        d = {'name': level1,
              'children': [{'name': level2,
                            'value': value}]}

        flare['children'][key0.index(level0)]['children'].append(d)

    else:
        # both the root and the second-level node exist: append only the leaf
        d = {'name': level2,
             'value': value}

        flare['children'][key0.index(level0)]['children'][key1.index(level1)]['children'].append(d)

# uncomment next three lines to save as json file
# save to some file
# with open('filename_here.json', 'w') as outfile:
#     json.dump(flare, outfile)

print(json.dumps(flare, indent=2))

""" the expected output of this json data
{
  "name": "flare",
  "children": [
    {
      "name": "Animal",
      "children": [
        {
          "name": "Mammal",
          "children": [
            {
              "name": "Fox",
              "value": 35000
            },
            {
              "name": "Lion",
              "value": 25000
            }
          ]
        },
        {
          "name": "Fish",
          "children": [
            {
              "name": "Cod",
              "value": 15000
            }
          ]
        }
      ]
    },
    {
      "name": "Plant",
      "children": [
        {
          "name": "Tree",
          "children": [
            {
              "name": "Oak",
              "value": 1500
            }
          ]
        }
      ]
    }
  ]
}
"""
answered 2020-12-17T02:41:14.053