Did set_index change drastically in the latest pandas release (0.8)? I can't get it to work as expected.

My first attempt was to set the index on 'id':

ipdb> merged2['id']
16    130809
25    130687
32    130686
9      41736
22    131913
7     130691
33    129993
13    130680
28    134295
29    130708

ipdb> merged2.set_index('id')
*** KeyError: 0
ipdb> [type(i) for i in merged2['id']]
[<type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>, <type 'numpy.float64'>]

The current index is int:

ipdb> merged2.index
Int64Index([16, 25, 32,  9, 22,  7, 33, 13, 28, 29])

ipdb> [type(i) for i in merged2.index]
[<type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>, <type 'numpy.int64'>]

As a workaround, I tried building a new index:

ndx=range(len(merged2))
[type(i) for i in ndx]
[<type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>, <type 'int'>]


ipdb> merged2.set_index(ndx)
*** KeyError: 'no item named 0'
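
The 'no item named 0' error is consistent with how set_index reads a plain list: each element is looked up as a column label, so a list of integers is treated as columns named 0, 1, 2 and so on rather than as new index values. The following is a minimal sketch, assuming a current pandas release and a hypothetical three-row frame (the data and the `value` column are made up for illustration). It shows two ways to attach positional labels instead.

```python
import pandas as pd

# Hypothetical stand-in for merged2; only the 'id' column name comes from the question.
df = pd.DataFrame(
    {"id": [130809.0, 130687.0, 130686.0], "value": [1.0, 2.0, 3.0]},
    index=[16, 25, 32],
)
positions = list(range(len(df)))

# A plain list is read as a list of column labels, so this lookup fails.
try:
    df.set_index(positions)
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Wrapping the values in an Index makes set_index use them as the index itself.
by_position = df.set_index(pd.Index(positions))

# reset_index(drop=True) discards the old index and renumbers from 0.
renumbered = df.reset_index(drop=True)
```

Both `by_position` and `renumbered` keep the `id` and `value` columns and carry the labels 0, 1, 2.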

Finally, mapping the 'id' column to int works:

merged2['id'] = map(lambda x: int(x), merged2['id'])
merged2.set_index('id')
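
The same cast can be written with astype instead of an element-wise map. This is only a sketch on hypothetical data (the `value` column and the three rows are assumptions), and it does not explain why the float column failed on 0.8. Note that set_index returns a new frame, so the result has to be assigned to keep it.

```python
import pandas as pd

# Hypothetical stand-in for merged2 with a float 'id' column, as in the question.
merged2 = pd.DataFrame(
    {"id": [130809.0, 130687.0, 41736.0], "value": ["a", "b", "c"]},
    index=[16, 25, 9],
)

# Cast the whole column at once rather than mapping int() over each element.
merged2["id"] = merged2["id"].astype("int64")

# set_index returns a new DataFrame; merged2 itself is left unchanged.
indexed = merged2.set_index("id")
```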

Any ideas about what I'm doing wrong?

1 Answer

It seems to work for me on 0.8.1dev. Could you post the stack trace and/or what merged2 looks like? Also, are you sure you're using pandas 0.8?

In [50]: import numpy as np; import pandas as pd

In [51]: idx = pd.Index([16, 25, 32, 9, 22, 7, 33, 13, 28, 29])

In [52]: idx
Out[52]: Int64Index([16, 25, 32,  9, 22,  7, 33, 13, 28, 29])

In [53]: df = pd.DataFrame(np.random.randn(len(idx), 3), idx, ['id', 1, 2])

In [54]: df
Out[54]: 
          id         1         2
16  0.351188  2.082303 -0.143037
25  0.633243 -1.731306  0.749934
32 -0.337893 -0.264249 -0.549856
9  -0.728056  0.786955  1.103877
22  1.131559 -0.255439 -0.397913
7  -1.384519  0.397626 -0.421481
33  1.356455  2.863659 -2.060498
13 -0.355786 -0.051383 -0.609486
28 -0.056607  0.767800  1.433946
29 -0.288202 -0.437992  0.843746

In [55]: df.set_index('id')
Out[55]: 
                  1         2
id                           
 0.351188  2.082303 -0.143037
 0.633243 -1.731306  0.749934
-0.337893 -0.264249 -0.549856
-0.728056  0.786955  1.103877
 1.131559 -0.255439 -0.397913
-1.384519  0.397626 -0.421481
 1.356455  2.863659 -2.060498
-0.355786 -0.051383 -0.609486
-0.056607  0.767800  1.433946
-0.288202 -0.437992  0.843746

In [56]: pd.__version__
Out[56]: '0.8.1.dev-e2633d4'
answered 2012-07-19T16:32:03.447