Given a sample like this:

In [39]: ts = pd.Series(np.random.randn(20),index=pd.date_range('1/1/2000',periods=20))
In [40]: t = pd.DataFrame(ts,columns=['base'],index=ts.index)
In [42]: t['shift_one'] = t.base - t.base.shift(1)
In [43]: t['shift_two'] = t.shift_one.shift(1)
In [44]: t
Out[44]: 
                base  shift_one  shift_two
2000-01-01 -1.239924        NaN        NaN
2000-01-02  1.116260   2.356184        NaN
2000-01-03  0.401853  -0.714407   2.356184
2000-01-04 -0.823275  -1.225128  -0.714407
2000-01-05 -0.562103   0.261171  -1.225128
2000-01-06  0.347143   0.909246   0.261171
.............
2000-01-20 -0.062557  -0.467466   0.512293

Now, t[t.shift_one > 0] works fine, but when we use:

In [48]: t[t.shift_one > 0 and t.shift_two < 0]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 t[t.shift_one > 0 and t.shift_two < 0]

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Suppose we want the subset of rows that satisfies both conditions. How do we do that? Thanks.

1 Answer

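You need parentheses, and you need to use & rather than and. Python's and asks for a single truth value of a whole Series, which is what raises the "ambiguous" error, whereas & combines the two conditions element by element. The parentheses are required because & binds more tightly than > and <.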

See the docs here: http://pandas.pydata.org/pandas-docs/dev/indexing.html#boolean-indexing
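To make the two failure modes concrete, here is a minimal sketch on a small hand-made frame. The name demo and its values are made up for illustration and are not the data from the question; they are chosen so the matching rows are easy to check by eye.

```python
import pandas as pd

# Hypothetical toy data (not from the question).
demo = pd.DataFrame(
    {"shift_one": [1.0, -2.0, 3.0, 4.0], "shift_two": [-1.0, -1.0, 2.0, -5.0]},
    index=pd.date_range("2000-01-01", periods=4),
)

# `and` calls bool() on a whole Series, which pandas refuses with ValueError.
try:
    demo[demo.shift_one > 0 and demo.shift_two < 0]
    and_error = None
except ValueError as exc:
    and_error = exc

# Without parentheses, & binds tighter than > and <, so this parses as
# demo.shift_one > (0 & demo.shift_two) < 0 and fails as well.
# The exact exception type depends on the column dtype, so both are caught.
try:
    demo[demo.shift_one > 0 & demo.shift_two < 0]
    precedence_error = None
except (TypeError, ValueError) as exc:
    precedence_error = exc

# Parenthesised comparisons combined element-wise with & give a boolean mask.
mask = (demo.shift_one > 0) & (demo.shift_two < 0)
subset = demo[mask]
print(subset)
```

Only the first and last rows of this toy frame have shift_one > 0 and shift_two < 0, so subset contains those two rows.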

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: ts = pd.Series(np.random.randn(20),index=pd.date_range('1/1/2000',periods=20))

In [4]: t = pd.DataFrame(ts,columns=['base'],index=ts.index)

In [5]: t['shift_one'] = t.base - t.base.shift(1)

In [6]: t['shift_two'] = t.shift_one.shift(1)

In [7]: t
Out[7]: 
                base  shift_one  shift_two
2000-01-01 -1.116040        NaN        NaN
2000-01-02  1.592079   2.708118        NaN
2000-01-03  0.958642  -0.633436   2.708118
2000-01-04  0.431970  -0.526672  -0.633436
2000-01-05  1.275624   0.843654  -0.526672
2000-01-06  0.498401  -0.777223   0.843654
2000-01-07 -0.351465  -0.849865  -0.777223
2000-01-08 -0.458742  -0.107277  -0.849865
2000-01-09 -2.100404  -1.641662  -0.107277
2000-01-10  0.601658   2.702062  -1.641662
2000-01-11 -2.026495  -2.628153   2.702062
2000-01-12  0.391426   2.417921  -2.628153
2000-01-13 -1.177292  -1.568718   2.417921
2000-01-14 -0.374543   0.802749  -1.568718
2000-01-15  0.338649   0.713192   0.802749
2000-01-16 -1.124820  -1.463469   0.713192
2000-01-17  0.484175   1.608995  -1.463469
2000-01-18 -1.477772  -1.961947   1.608995
2000-01-19  0.481843   1.959615  -1.961947
2000-01-20  0.760168   0.278325   1.959615

In [8]: t[(t.shift_one>0) & (t.shift_two<0)]
Out[8]: 
                base  shift_one  shift_two
2000-01-05  1.275624   0.843654  -0.526672
2000-01-10  0.601658   2.702062  -1.641662
2000-01-12  0.391426   2.417921  -2.628153
2000-01-14 -0.374543   0.802749  -1.568718
2000-01-17  0.484175   1.608995  -1.463469
2000-01-19  0.481843   1.959615  -1.961947
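
The same pattern extends to other combinations. The sketch below is an illustration on made-up data (the frame demo and the variable names are hypothetical): | is the element-wise "or", ~ is the element-wise "not", and DataFrame.query accepts the word and inside its expression string, which may read closer to what the question first tried.

```python
import pandas as pd

# Hypothetical toy data (not from the question).
demo = pd.DataFrame(
    {"shift_one": [1.0, -2.0, 3.0, 4.0], "shift_two": [-1.0, -1.0, 2.0, -5.0]},
    index=pd.date_range("2000-01-01", periods=4),
)

# Both conditions must hold (element-wise AND).
both = demo[(demo.shift_one > 0) & (demo.shift_two < 0)]

# At least one condition holds (element-wise OR).
either = demo[(demo.shift_one > 0) | (demo.shift_two < 0)]

# Rows where the first condition does NOT hold (element-wise NOT).
not_first = demo[~(demo.shift_one > 0)]

# query() parses the string itself, so `and` is allowed here.
via_query = demo.query("shift_one > 0 and shift_two < 0")
```

Here via_query selects the same rows as both; which spelling to use is mostly a matter of taste.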
Answered 2013-06-03T14:41:21.327