python-3.x - Flattening a dictionary with pandas

Question

   [{'name': 'Test Item1',
  'column_values': [{'title': 'col2', 'text': 'Oladimeji Olaolorun'},
   {'title': 'col3', 'text': 'Working on it'},
   {'title': 'col4', 'text': '2019-09-17'},
   {'title': 'col5', 'text': '1'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item2',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-20'},
   {'title': 'col5', 'text': '2'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item3',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': None},
   {'title': 'col4', 'text': '2019-09-25'},
   {'title': 'col5', 'text': '3'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Item 4',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Stuck'},
   {'title': 'col4', 'text': '2019-09-06'},
   {'title': 'col5', 'text': '4'}],
  'group': {'title': 'Group 2'}},
 {'name': 'Item 5',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-28'},
   {'title': 'col5', 'text': '5'}],
  'group': {'title': 'Group 2'}},
 {'name': 'item 6',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2020-03-05'},
   {'title': 'col5', 'text': '76'}],
  'group': {'title': 'Group 2'}}]

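I am currently pulling data from the Monday.com API. My call returns the response above, a list of nested dicts, and I am trying to find the best way to flatten it into a DataFrame.

I am currently using json_normalize(results['data']['boards'][0]['items']), which seems to give me the following result.

The desired output is a table like the following.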

score 0 · Accepted Answer

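With the glom module, it is easy to extract the required 'text' keys from the nested list. Read the data into a pandas DataFrame, split the names column, and finally concatenate the result back onto the parent DataFrame.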

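import pandas as pd
from glom import glom

# Map each output key to a path inside one item of the API response.
spec = {'names':('column_values',['text']),
        'group': 'group.title',
        'Name' : 'name'
        }

# Apply the spec to every item; `data` is assumed to hold the list shown in the question.
M = glom(data, [spec])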

The following function replaces None entries with the string 'None', so that the lists can be joined as strings:

def replace_none(val_list):
    val_list = ['None' if v is None else v for v in val_list]
    return val_list

for i in M:
    i['names'] = replace_none(i['names'])

df = pd.DataFrame(M)

df_split = df['names'].str.join(',').str.split(',',expand=True).add_prefix('Col')

df = df.drop('names',axis=1)

pd.concat([df,df_split],axis=1)

    group   Name         Col0                Col1              Col2   Col3
0   Group 1 Test Item1  Oladimeji Olaolorun Working on it   2019-09-17  1
1   Group 1 Test Item2  Lucie Phillips      Done            2019-09-20  2
2   Group 1 Test Item3  David Binns         None            2019-09-25  3
3   Group 2 Item 4      Lucie Phillips      Stuck           2019-09-06  4
4   Group 2 Item 5      David Binns         Done            2019-09-28  5
5   Group 2 item 6      Lucie Phillips      Done            2020-03-05  76
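
The join/split step above assumes that no 'text' value contains a comma, and it needs the None replacement because str.join cannot handle None. As a hedged alternative, here is a minimal sketch that expands the lists directly with the DataFrame constructor. The two-item sample data (including the comma in 'Phillips, Lucie') and the names rows and flat are made up for illustration, and the intermediate rows are built by hand here to mirror what the glom spec returns, rather than by calling glom.

```python
import pandas as pd

# Hypothetical two-item sample in the same shape as the question's data.
data = [
    {'name': 'Test Item1',
     'column_values': [{'title': 'col2', 'text': 'Oladimeji Olaolorun'},
                       {'title': 'col3', 'text': None}],
     'group': {'title': 'Group 1'}},
    {'name': 'Item 4',
     'column_values': [{'title': 'col2', 'text': 'Phillips, Lucie'},
                       {'title': 'col3', 'text': 'Stuck'}],
     'group': {'title': 'Group 2'}},
]

# Build rows with the same keys as the glom spec: names, group, Name.
rows = [{'names': [cv['text'] for cv in item['column_values']],
         'group': item['group']['title'],
         'Name': item['name']}
        for item in data]
df = pd.DataFrame(rows)

# Expand each list into its own columns. Missing values stay missing
# and commas inside a value do not cause an extra split.
df_split = pd.DataFrame(df['names'].tolist(), index=df.index).add_prefix('Col')
flat = pd.concat([df.drop('names', axis=1), df_split], axis=1)
print(flat)
```

This keeps one column per entry in column_values, so it relies on every item having the same number of entries in the same order.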

Update: all of the code above is unnecessary. The code below is simpler, more concise, and clearer.

d=[]
for ent in data:
    for entry in ent['column_values']:
        entry.update({'name':ent['name']})
        entry.update({'group':ent['group']['title']})
        d.append(entry)

res = pd.DataFrame(d)

res.set_index(['name','group','title']).unstack()

                                                               text
              title col2                col3            col4    col5
name         group              
Item 4      Group 2 Lucie Phillips      Stuck           2019-09-06  4
Item 5      Group 2 David Binns         Done            2019-09-28  5
Test Item1  Group 1 Oladimeji Olaolorun Working on it   2019-09-17  1
Test Item2  Group 1 Lucie Phillips      Done            2019-09-20  2
Test Item3  Group 1 David Binns         None            2019-09-25  3
item 6      Group 2 Lucie Phillips      Done            2020-03-05  76
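
Since the question started from json_normalize, the same long-to-wide reshaping can also be sketched with pandas.json_normalize using its record_path and meta parameters, followed by a pivot. This is only one possible variant, not the answer's method. The two-item sample and the names long_df and wide are made up for illustration, and passing a list to pivot's index assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical two-item sample in the same shape as the question's data.
data = [
    {'name': 'Test Item1',
     'column_values': [{'title': 'col2', 'text': 'Oladimeji Olaolorun'},
                       {'title': 'col3', 'text': None}],
     'group': {'title': 'Group 1'}},
    {'name': 'Item 4',
     'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
                       {'title': 'col3', 'text': 'Stuck'}],
     'group': {'title': 'Group 2'}},
]

# One row per column_values entry, with the item's name and group title
# carried along as metadata columns ('name' and 'group.title').
long_df = pd.json_normalize(
    data,
    record_path='column_values',
    meta=['name', ['group', 'title']],
)

# Turn each distinct 'title' into its own column, one row per item.
wide = (long_df.rename(columns={'group.title': 'group'})
               .pivot(index=['name', 'group'], columns='title', values='text')
               .reset_index())
wide.columns.name = None  # drop the leftover 'title' label on the columns
print(wide)
```

Unlike the loop above, this does not modify the dicts in data, and the result has flat column names instead of a ('text', title) MultiIndex.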
