python - Change certain values in multiple columns of a Pandas DataFrame at once

Question

Suppose I have the following DataFrame:

In [1]: df
Out[1]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1   NaN      4    bad
2     2      5   good

But this does not:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

Why? How can I change the 'apple' and 'banana' values without having to write it out in two lines, as in

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df['banana'][df.cherry == 'bad'] = np.nan

score 37 · Accepted Answer

You should use loc and do this without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]: 
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good

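See the documentation on returning a view versus a copy. If you chain the indexing, the assignment is made to a copy (which is then thrown away), but if you do it in a single loc call, pandas recognises that you want to assign to the original.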
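To make the view-versus-copy point concrete, here is a minimal sketch that runs both forms side by side on the question's data. The helper name make_df is hypothetical, and the apple and banana columns are stored as floats (an assumption of this sketch, not the question's original integer columns) so that NaN can be written without a dtype change.

```python
import warnings

import numpy as np
import pandas as pd


def make_df():
    # Same data as the question; apple/banana are floats here (an assumption
    # of this sketch) so that NaN fits without changing the column dtype.
    return pd.DataFrame(
        {
            "apple": [0.0, 1.0, 2.0],
            "banana": [3.0, 4.0, 5.0],
            "cherry": ["good", "bad", "good"],
        }
    )


# Chained form: df[[...]] builds a new DataFrame, and the assignment
# lands on that temporary object, which is then discarded.
chained = make_df()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    chained[["apple", "banana"]][chained.cherry == "bad"] = np.nan

# Single .loc call: the row mask and the column list go into one indexing
# operation, so pandas writes into the original frame.
direct = make_df()
direct.loc[direct.cherry == "bad", ["apple", "banana"]] = np.nan

print(chained)  # unchanged
print(direct)   # row 1 has NaN in apple and banana
```

The chained frame comes back untouched, while the frame modified through a single loc call has NaN in both columns of the 'bad' row.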

score 4 · Accepted Answer

This is because df[['apple', 'banana']][df.cherry == 'bad'] = np.nan assigns to a copy of the DataFrame. Try this:

# .ix was deprecated in pandas 0.20 and removed in 1.0; .loc is the label-based equivalent
df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

score 1 · Accepted Answer

Although the question is broad, the answers seem very specific and not very general. This is just to clarify...

df = pandas.DataFrame({'Test1' :[1,2,3,4,5], 'Test2': [3,4,5,6,7], 'Test3': [5,6,7,8,9]})

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  4     6     8
4  5     7     9

# When the index or row you want to edit is known
df.loc[3, ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

# When you don't know the index but can find it by looking in a column for a specific value

df.loc[df[df['Test1'] == 4].index[0], ['Test1', 'Test2', 'Test3']] = [10, 12, 14]

   Test1 Test2 Test3
0  1     3     5
1  2     4     6
2  3     5     7
3  10    12    14
4  5     7     9

Both methods let you change the values of multiple columns in a single line of code.
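The lookup with .index[0] above only touches the first matching row. As a hedged variation on the same idea, the sketch below passes the boolean mask straight to loc so that every matching row is updated. The condition Test1 >= 4 is chosen purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Test1": [1, 2, 3, 4, 5],
        "Test2": [3, 4, 5, 6, 7],
        "Test3": [5, 6, 7, 8, 9],
    }
)

# Boolean mask selecting every row where Test1 is at least 4 (rows 3 and 4).
mask = df["Test1"] >= 4

# One value per listed column; each matching row receives the same values.
df.loc[mask, ["Test1", "Test2", "Test3"]] = [10, 12, 14]

print(df)
```

Rows 3 and 4 both end up as 10, 12, 14, and the remaining rows are left alone.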
