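I am trying to select all cells in a pandas DataFrame that meet a certain condition, but only in rows where a specific column also meets a separate condition.

Given the following DataFrame: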

      A    B    C    D
1/1   0    1    0    1
1/2   2    1    1    1
1/3   3    0    1    0 
1/4   1    0    1    2
1/5   1    0    1    1
1/6   2    0    2    1
1/7   3    5    2    3

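I would like to select the data where a column's value is greater than its previous value and where D is also > 1. The syntax I am currently trying is: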

matches = df[(df > df.shift(1)) & (df.D > 1)]

However, when I do this, I get the following error:

TypeError: Could not operate [array([nan, nan, nan, nan], dtype=object)] with block values [operands could not be broadcast together with shapes (2016) (4) ]

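Note: the error is copied and pasted directly from my actual code, so the description and shapes in it do not correspond directly to my example DataFrame.

I know that df.D > 1 is what causes the problem, and that comparing directly against column D works (df > df.D, for example). What is wrong with my syntax when I try to compare D against the value 1, and how can I do this?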

2 Answers

This should work directly, but pandas does not broadcast the and-ing operator (that will happen in 0.14). Here is a workaround.

In [74]: df
Out[74]: 
     A  B  C  D
1/1  0  1  0  1
1/2  2  1  1  1
1/3  3  0  1  0
1/4  1  0  1  2
1/5  1  0  1  1
1/6  2  0  2  1
1/7  3  5  2  3

This is a where operation; it essentially puts np.nan wherever the condition is False.

In [78]: x = df[df>df.shift(1)]

In [79]: x
Out[79]: 
      A   B   C   D
1/1 NaN NaN NaN NaN
1/2   2 NaN   1 NaN
1/3   3 NaN NaN NaN
1/4 NaN NaN NaN   2
1/5 NaN NaN NaN NaN
1/6   2 NaN   2 NaN
1/7   3   5 NaN   3

Then select by the second condition:

In [80]: x[df.D>1]
Out[80]: 
      A   B   C  D
1/4 NaN NaN NaN  2
1/7   3   5 NaN  3
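
For reference, the same two-step workaround can be sketched as a self-contained script. The frame is rebuilt from the table in the question (the string index labels are an assumption), and the names `increased`, `matches`, `cell_mask`, and `matches_alt` are illustrative only. The NumPy variant at the end is one possible way to broadcast the row condition across columns by hand; it is not taken from the original answer.

```python
import pandas as pd

# Example frame rebuilt from the question's table; index labels are assumed to be strings.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

# Step 1: keep cells greater than the previous row's value; all other cells become NaN.
increased = df.where(df > df.shift(1))

# Step 2: keep only the rows where D > 1 (boolean Series -> row selection).
matches = increased[df["D"] > 1]

# Alternative sketch: build one cell-level mask by broadcasting the row
# condition across the columns with NumPy, then drop the all-NaN rows.
cell_mask = (df > df.shift(1)).to_numpy() & (df["D"] > 1).to_numpy()[:, None]
matches_alt = df.where(cell_mask).dropna(how="all")

print(matches)
```

Both variants should leave only rows 1/4 and 1/7, matching the output above.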

answered 2013-10-21T12:43:56.973

I think the problem is actually that the boolean array from the shift operation is one element shorter than the other conditional. Try adding a False to the first conditional at index zero; you should then be able to combine the two conditionals.

If the problem really is with the second conditional, could you post the result of

df.dtypes

It looks like it's not an int type, given the NaN array error.
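
A minimal sketch for checking both of these points on the example frame from the question (the frame construction and the variable names are assumptions; the real data may differ). Note that `shift` returns an object of the same shape as its input and fills the first row with NaN, so on this example the two masks already have the same length.

```python
import pandas as pd

# Example frame rebuilt from the question's table; the real data may differ.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

shifted = df.shift(1)
first_mask = df > shifted      # cell-level condition, one value per cell
second_mask = df["D"] > 1      # row-level condition, one value per row

# Compare the shapes of the two conditionals.
print(first_mask.shape, second_mask.shape)

# Inspect the column dtypes; an object dtype here would point to mixed or non-numeric data.
print(df.dtypes)
all_integer = all(pd.api.types.is_integer_dtype(t) for t in df.dtypes)
```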

answered 2013-10-21T00:54:35.917