I have a pandas pivot_table that aggregates 2 data sets in 2 columns across several rows. I would like to add another column that is the difference between the aggregated values in the two existing columns by row. Is there a way to implement this directly in the pivot_table() call? I know that the returned pivot is a dataframe so I can calculate it through other means, but just curious if there is a more efficient way.

Simple example of my data:

  Set     Type   Val
  S1       A     1
  S1       B     2
  S1       B     3
  S2       A     4
  S2       B     5
  S2       C     6

Using the following code where data is my df

piv=pivot_table(data,'Val',rows='Type',cols='Set',aggfunc=sum,fill_value=0.0)

I get the below

    S1  S2
A   1   4
B   5   5
C   0   6

I would like the output to be

    S1  S2 Diff
A   1   4   3
B   5   5   0
C   0   6   6

or just

   Diff
A   3
B   0
C   6
1 Answer

Simple. DataFrames (and matrices in general) make it easy to operate on many elements at once.

Define the function you want to apply.

>>> def abs_diff(x, y):
...     return abs(x - y)

Then apply it.

>>> df['Diff'] = abs_diff(df['S1'], df['S2'])

>>> df

   S1  S2  Diff
A   1   4     3
B   5   5     0
C   0   6     6
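
For anyone who wants to run the steps above end to end, here is a minimal sketch. It assumes a recent pandas release, where `pivot_table` takes `index=` and `columns=` instead of the `rows=` and `cols=` keywords used in the question. The names `data` and `piv` come from the question; everything else is illustrative.

```python
import pandas as pd

# Sample data from the question.
data = pd.DataFrame({
    "Set":  ["S1", "S1", "S1", "S2", "S2", "S2"],
    "Type": ["A", "B", "B", "A", "B", "C"],
    "Val":  [1, 2, 3, 4, 5, 6],
})

# Assumption: recent pandas, so index=/columns= replace the older rows=/cols=.
piv = pd.pivot_table(data, values="Val", index="Type", columns="Set",
                     aggfunc="sum", fill_value=0)

# Column arithmetic is element-wise, so no explicit loop over rows is needed.
piv["Diff"] = (piv["S1"] - piv["S2"]).abs()

print(piv)
```

Using `.abs()` mirrors the `abs_diff` helper above. If a signed difference is wanted instead, dropping `.abs()` and writing `piv["S2"] - piv["S1"]` gives the same numbers for this particular data.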

Of course, if you only want to display a particular column:

>>> df['Diff']

A    3
B    0
C    6
Name: Diff

(>>> is, of course, the Python shell prompt.)
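
On the original question of doing this inside the `pivot_table()` call itself: as far as I know, `pivot_table` has no parameter for derived columns, so the nearest equivalent is probably to chain `.assign` onto its result and keep everything in one expression. A sketch under that assumption, again using the newer `index=`/`columns=` keywords and the sample data from the question (`diff_only` is a hypothetical name):

```python
import pandas as pd

# Sample data from the question.
data = pd.DataFrame({
    "Set":  ["S1", "S1", "S1", "S2", "S2", "S2"],
    "Type": ["A", "B", "B", "A", "B", "C"],
    "Val":  [1, 2, 3, 4, 5, 6],
})

# Pivot and add the derived column in a single chained expression.
# The lambda receives the pivoted frame, so S1 and S2 already exist there.
piv = (
    pd.pivot_table(data, values="Val", index="Type", columns="Set",
                   aggfunc="sum", fill_value=0)
      .assign(Diff=lambda t: (t["S2"] - t["S1"]).abs())
)

# Keep only the Diff column, as in the second desired output.
diff_only = piv[["Diff"]]

print(piv)
print(diff_only)
```

This is a readability convenience rather than a performance gain: the subtraction is still a separate vectorized step that runs after the pivot has been built.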

Answered 2012-11-28T23:31:52.053