0

I have a data frame where there's multiple options for a certain index (1-M relationship) - e.g. States as Index and Counties as respective columns. I want to group it in a way that creates just one column but with all the values. This is a basic transformation but somehow I can't get it right.

Sorry I don't know how to insert code that actually is already run so here I present the code to create example DFs as to what I'd like to create.

pd.DataFrame({'INDEX': ['INDEX1','INDEX2','INDEX3'],
                    'col1': ['a','b','d'],
                   'col2': ['c','f',np.nan],
                   'col3': ['e',np.nan,np.nan]})

and I want it to transform it so that I end up with this data frame:

pd.DataFrame({'INDEX': ['INDEX1','INDEX1','INDEX1','INDEX2','INDEX2','INDEX3'],
                    'col1': ['a','c','e','b','f','d']})
4

1 回答 1

0

你可以melt在这里使用:

df = pd.melt(df, id_vars=['INDEX']).drop(columns=['variable']).dropna()
print(df)

    INDEX value
0  INDEX1     a
1  INDEX2     b
2  INDEX3     d
3  INDEX1     c
4  INDEX2     f
6  INDEX1     e
于 2020-10-18T15:39:06.320 回答