25

Suppose I have the following DataFrame:

In [1]: df
Out[1]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1   NaN      4    bad
2     2      5   good

But this does not:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

Why? How can I set the "apple" and "banana" values without having to write two lines, as in:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df['banana'][df.cherry == 'bad'] = np.nan
4

3 Answers

37

You should use loc and do this without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]: 
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good

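See the documentation on returning a view versus a copy. If you chain the indexing, the assignment is made to a copy (which is then thrown away), but if you do it in a single loc call, pandas recognizes that you want to assign to the original.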
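
To make the view-versus-copy point concrete, here is a minimal sketch that runs both forms on the question's data. It assumes float columns (so that NaN fits the dtype on any pandas version) and silences the chained-assignment warning, whose exact class depends on the pandas version; the variable names are illustrative only.

```python
import warnings

import numpy as np
import pandas as pd

# Same data as in the question; floats are an assumption made here so that
# assigning NaN never requires a dtype change.
df = pd.DataFrame({
    "apple": [0.0, 1.0, 2.0],
    "banana": [3.0, 4.0, 5.0],
    "cherry": ["good", "bad", "good"],
})
mask = df["cherry"] == "bad"

# Chained form: df[[...]] builds a new DataFrame, so the assignment lands on
# that temporary object and df itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas warns about chained assignment
    df[["apple", "banana"]][mask] = np.nan
chained_changed_df = bool(df[["apple", "banana"]].isna().any().any())

# Single .loc call: rows and columns are selected in one indexing operation,
# so the assignment is applied to df directly.
df.loc[mask, ["apple", "banana"]] = np.nan
loc_changed_df = bool(df.loc[mask, ["apple", "banana"]].isna().all().all())

print(chained_changed_df, loc_changed_df)
```

The first flag should come out False and the second True: only the single loc call modifies df.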

answered 2013-11-08T20:14:12.477
4

This is because df[['apple', 'banana']][df.cherry == 'bad'] = np.nan assigns to a copy of the DataFrame. Try this:

# .ix was removed in pandas 1.0; .loc does the same label-based selection here
df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan
answered 2013-11-08T20:14:38.900
1

Although this question is broad, the answers seem very specific and not very general. This is just to clarify...

import pandas

df = pandas.DataFrame({'Test1': [1, 2, 3, 4, 5], 'Test2': [3, 4, 5, 6, 7], 'Test3': [5, 6, 7, 8, 9]})

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  4     6     8
4  5     7     9

# When the index or row you want to edit is known
df.loc[3, ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

# When you don't know the index but can find it by looking in a column for a specific value

df.loc[df[df['Test1'] == 4].index[0], ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  10    12    14
4  5     7     9

Both approaches let you change the values of multiple columns in a single line of code.
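
As a possible variation on the second approach, the boolean condition can be passed to loc directly instead of looking up index[0] first. This is only a sketch of that alternative, not the answer's own method. Note that it updates every row where the condition holds, whereas index[0] touches only the first match.

```python
import pandas as pd

df = pd.DataFrame({
    "Test1": [1, 2, 3, 4, 5],
    "Test2": [3, 4, 5, 6, 7],
    "Test3": [5, 6, 7, 8, 9],
})

# Boolean mask used as the row selector: all rows with Test1 == 4 are updated
# (here that is just the row labelled 3).
mask = df["Test1"] == 4
df.loc[mask, ["Test1", "Test2", "Test3"]] = [10, 12, 14]

print(df)
```

With the data above only one row matches, so the result is the same table as the one shown earlier in this answer.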

于 2021-08-16T22:03:15.103 回答