I have a DataFrame like the one below. How can I select the rows whose second index level is in ['two', 'three']?

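import numpy as np
from pandas import DataFrame, MultiIndex

# Note: the codes argument was named labels before pandas 0.24
index = MultiIndex(levels=[['foo', 'bar', 'baz', 'qux'],
                           ['one', 'two', 'three']],
                   codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3],
                          [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]])
hdf = DataFrame(np.random.randn(10, 3), index=index,
                columns=['A', 'B', 'C'])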

In [3]: hdf
Out[3]: 
                  A         B         C
foo one   -1.274689  0.946294 -0.149131
    two   -0.015483  1.630099  0.085461
    three  1.396752 -0.272583 -0.760000
bar one   -1.151217  1.269658  2.457231
    two   -1.657258 -1.271384 -2.429598
baz two    1.124609  0.138720 -1.994984
    three  0.124298 -0.127099 -0.409736
qux one    0.535038  1.139026  0.414842
    two    0.287724  0.461041 -0.268918
    three -0.259649  0.226574 -0.558334

2 Answers

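One way is to use the DataFrame's select method (note that select was deprecated in pandas 0.21 and removed in 1.0, so this only works on older versions):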

In [4]: hdf.select(lambda x: x[1] in ['two', 'three'])
Out[4]: 
                  A         B         C
foo two   -0.015483  1.630099  0.085461
    three  1.396752 -0.272583 -0.760000
bar two   -1.657258 -1.271384 -2.429598
baz two    1.124609  0.138720 -1.994984
    three  0.124298 -0.127099 -0.409736
qux two    0.287724  0.461041 -0.268918
    three -0.259649  0.226574 -0.558334
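
On pandas versions where select is no longer available, the same per-label test can be approximated with a boolean list passed to loc. This is only a minimal sketch, not the answerer's method; the names keep and selected are illustrative, and the frame is rebuilt with codes= instead of labels= on the assumption of pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (codes= replaces the older labels= argument).
index = pd.MultiIndex(
    levels=[['foo', 'bar', 'baz', 'qux'], ['one', 'two', 'three']],
    codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3],
           [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]],
)
hdf = pd.DataFrame(np.random.randn(10, 3), index=index,
                   columns=['A', 'B', 'C'])

# Each index entry is a tuple such as ('foo', 'two'); test its second element.
keep = [label[1] in ('two', 'three') for label in hdf.index]

# A boolean list of the same length as the index selects the matching rows.
selected = hdf.loc[keep]
print(selected)
```

The list comprehension plays the role of the lambda given to select, so it stays easy to adapt to other per-label conditions.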
answered 2012-12-16T09:31:09.963

Note that you can also do the following:

In [9]: hdf.index.get_level_values(1).isin(['two', 'three'])
Out[9]: array([False,  True,  True, False,  True,  True,  True, False,  True,  True], dtype=bool)
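
The expression above only produces a boolean mask; to get the rows themselves, the mask can be used to index the frame. The sketch below assumes pandas 0.24 or later (hence codes= rather than labels=), and the names mask and result are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (codes= replaces the older labels= argument).
index = pd.MultiIndex(
    levels=[['foo', 'bar', 'baz', 'qux'], ['one', 'two', 'three']],
    codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3],
           [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]],
)
hdf = pd.DataFrame(np.random.randn(10, 3), index=index,
                   columns=['A', 'B', 'C'])

# Boolean mask: True where the second index level is 'two' or 'three'.
mask = hdf.index.get_level_values(1).isin(['two', 'three'])

# Boolean indexing keeps only the rows where the mask is True.
result = hdf[mask]
print(result)
```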

There really should be a better syntax for this.

answered 2013-01-16T04:04:05.870