I am new to pandas and Python. I have a dictionary that looks like this:

{'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}

I would like to get a DataFrame that looks like this:

Keys values 
A    aa
A    ab
A    ac
B    ba
B    bb
C    -

Please help.

4 Answers

You can preprocess the dictionary first and then create the DataFrame:

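import numpy as np
import pandas as pd

d = {'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}

# replace each empty list with [np.nan] so the key still produces one row
d_ = {k: [np.nan] if len(v) == 0 else v for k, v in d.items()}

# flatten the dictionary into one [key, value] row per list element
df = pd.DataFrame([[k, i] for k, v in d_.items() for i in v])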

    0   1
0   A   aa
1   A   ab
2   A   ac
3   B   ba
4   B   bb
5   C   NaN
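
To match the column names and the "-" placeholder shown in the question, the same flattening idea can be sketched as below. The variable name d and the use of "-" instead of NaN are assumptions taken from the desired output in the question, not part of the answer above.

```python
import pandas as pd

d = {'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}

# one (key, value) row per list element; an empty list falls back to ['-']
# (assumption: '-' is the placeholder the question asks for)
rows = [(k, i) for k, v in d.items() for i in (v or ['-'])]

# name the columns as in the question
df = pd.DataFrame(rows, columns=['Keys', 'values'])
print(df)
```

Keeping NaN instead of "-" is usually easier to work with later (for example with isna() or dropna()), so the placeholder is best applied only for display.
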
Answered on 2020-12-29T00:55:42.930

Try explode:

out = pd.Series(dct).explode().reset_index(name='value')
  index value
0     A    aa
1     A    ab
2     A    ac
3     B    ba
4     B    bb
5     C   NaN
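
If the column names and the "-" placeholder from the question are wanted, the explode approach can be extended roughly as follows. This is a sketch: the names 'Keys' and 'values' and the "-" fill value are assumptions taken from the question's desired output.

```python
import pandas as pd

dct = {'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}

out = (
    pd.Series(dct)
    .explode()                    # one row per list element; an empty list yields NaN
    .fillna('-')                  # assumed placeholder, as in the question
    .rename_axis('Keys')          # name the index so it becomes the 'Keys' column
    .reset_index(name='values')   # move the index into a column, name the data column
)
print(out)
```
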
Answered on 2020-12-29T01:19:51.263

Try using a dictionary comprehension and a for loop:

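import numpy as np
import pandas as pd

dct = {'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}
# pad every list with NaN up to the longest list (pd.np is unavailable in recent pandas)
n = max(len(v) for v in dct.values())
df = pd.DataFrame({k: v + [np.nan] * (n - len(v)) for k, v in dct.items()})
# reset the index so that len(melted) is always an unused row label
melted = pd.melt(df).dropna().reset_index(drop=True)
for i in dct.keys():
    # re-add keys whose list was empty, with NaN as the value
    if melted['variable'].tolist().count(i) == 0:
        melted.loc[len(melted)] = [i, np.nan]
melted = melted.sort_values('variable')
print(melted)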

Output:

  variable value
0        A    aa
1        A    ab
2        A    ac
3        B    ba
4        B    bb
5        C   NaN
Answered on 2020-12-29T00:50:43.327

Just iterate over the original dictionary to build a new dictionary with the keys 'keys' and 'values':

import pandas as pd
import numpy as np

orig_dict = {'A': ['aa', 'ab', 'ac'], 'B': ['ba', 'bb'], 'C': []}

new_dict = {'keys': [], 'values': []} 
for k, v in orig_dict.items():
    if not v:
        new_dict['keys'].append(k)
        new_dict['values'].append(np.nan)
        continue
    new_dict['keys'].extend([k]*len(v))
    new_dict['values'].extend(v)  

df = pd.DataFrame.from_dict(new_dict, orient="columns")

>>>  df 
    keys values
0    A     aa
1    A     ab
2    A     ac
3    B     ba
4    B     bb
5    C    NaN
Answered on 2020-12-29T00:54:18.960