
I have a DataFrame in which every column is numeric:

import pandas as pd
import numpy as np
np.random.seed(1001)
df = pd.DataFrame(np.random.randn(10, 2), columns=['A', 'B'])

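I want to create common quantiles that cover all the values of A and B together. Both columns have some missing values. After creating the common quantiles, I want to encode the values in the DataFrame so that each one shows a label for the quantile it falls into. I can do this column by column, but how do I do it across the whole DataFrame?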


1 Answer


I think you can first stack the DataFrame into a single Series, then apply qcut, and finally unstack:

import pandas as pd
import numpy as np

np.random.seed(1001)
df = pd.DataFrame(np.random.randn(10, 2), columns=['A', 'B'])
# .ix has been removed from pandas; .loc does the same label-based assignment
df.loc[0, 'A'] = np.nan
df.loc[2, 'A'] = np.nan
df.loc[3, 'B'] = np.nan
print(df)
          A         B
0       NaN -0.896065
1 -0.306299 -1.339934
2       NaN -0.641727
3  1.307946       NaN
4  0.829115 -0.023299
5 -0.208564 -0.916620
6 -1.074743 -0.086143
7  1.175839 -1.635092
8  1.228194  1.076386
9  0.394773 -0.387701

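# qcut expects quantile fractions in [0, 1], not bin edges in value space;
# [0, 0.5, 1] splits the pooled values of A and B at their common median
# (interval label formatting varies between pandas versions)
quantiles = np.linspace(0, 1, 3)
print(pd.qcut(df.stack(), quantiles).unstack())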
                  A                 B
0               NaN  (-1.635, -0.209]
1  (-1.635, -0.209]  (-1.635, -0.209]
2               NaN  (-1.635, -0.209]
3   (-0.209, 1.308]               NaN
4   (-0.209, 1.308]   (-0.209, 1.308]
5  (-1.635, -0.209]  (-1.635, -0.209]
6  (-1.635, -0.209]   (-0.209, 1.308]
7   (-0.209, 1.308]               NaN
8   (-0.209, 1.308]   (-0.209, 1.308]
9   (-0.209, 1.308]  (-1.635, -0.209]
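
The question also asks for labels rather than interval objects. A minimal sketch of one way to get them, using the same stack / qcut / unstack idea together with the `labels` argument of `pd.qcut`. The choice of four quantiles and the label names `Q1` to `Q4` are assumptions made for illustration, not something given in the question:

```python
import numpy as np
import pandas as pd

np.random.seed(1001)
df = pd.DataFrame(np.random.randn(10, 2), columns=['A', 'B'])
df.loc[[0, 2], 'A'] = np.nan
df.loc[3, 'B'] = np.nan

# Hypothetical label names, one per quartile (q=4 is an assumption).
names = ['Q1', 'Q2', 'Q3', 'Q4']

# Pool every value of A and B into one Series so the cut points are shared.
stacked = df.stack()

# qcut computes the quartile edges on the pooled values and returns one
# label per value; astype(object) turns the categorical into plain strings.
labels = pd.qcut(stacked, q=4, labels=names).astype(object)

# Back to the original layout; reindex restores any row or column that was
# dropped because it held only NaN.
encoded = labels.unstack().reindex(index=df.index, columns=df.columns)
print(encoded)
```

Missing values stay missing in the result, and because the edges come from the pooled values, a given label means the same range in both columns.
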
Answered 2016-06-22T06:55:58.153