We can use .idxmax to get the index of the maximum value in a DataFrame (df). My problem is that I have a df with several columns (more than 10), and one of the columns holds identifiers that repeat. For each identifier, I need to extract the row with the maximum value:

>df

id  value
a   0
b   1
b   1
c   0
c   2
c   1

Now, this is what I'd want:

>df

id  value
a   0
b   1
c   2

I am trying to get it by using df.groupby(['id']), but it is a bit tricky:

df.groupby(["id"]).ix[df['value'].idxmax()]

Of course, that doesn't work. I fear that I am not on the right path, so I thought I'd ask you guys! Thanks!

1 Answer

Close! Group by id, then select the value column and return the max for each group.

In [14]: df.groupby('id')['value'].max()
Out[14]: 
id
a     0
b     1
c     2
Name: value, dtype: int64
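
If you need the whole row rather than just the maximum (the question mentions more than 10 columns), one option is to combine groupby with idxmax and select the rows with .loc, since the .ix indexer from the question no longer exists in recent pandas. A minimal sketch, assuming the sample frame from the question; the names idx and result are only for illustration:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": ["a", "b", "b", "c", "c", "c"],
    "value": [0, 1, 1, 0, 2, 1],
})

# Index label of the first row holding the maximum value within each id group
idx = df.groupby("id")["value"].idxmax()

# Select those rows; any other columns in the frame come along unchanged
result = df.loc[idx].reset_index(drop=True)
print(result)
```

When a group has ties (like the two b rows), idxmax keeps only the first occurrence.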

If the OP wants to put these values back into the frame, just create a transform and assign it.

In [17]: df['max'] = df.groupby('id')['value'].transform(lambda x: x.max())

In [18]: df
Out[18]: 
  id  value  max
0  a      0    0
1  b      1    1
2  b      1    1
3  c      0    2
4  c      2    2
5  c      1    2
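
The transformed maximum can also serve as a boolean mask to keep only the rows that reach their group's maximum. A rough sketch, again assuming the question's sample frame; group_max, tied and deduped are hypothetical names used for illustration:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": ["a", "b", "b", "c", "c", "c"],
    "value": [0, 1, 1, 0, 2, 1],
})

# Per-row copy of the group maximum, aligned with the original index
group_max = df.groupby("id")["value"].transform("max")

# Keep every row whose value equals its group's maximum (ties are kept)
tied = df[df["value"] == group_max]

# Optionally collapse tied rows to one row per (id, value) pair
deduped = tied.drop_duplicates(subset=["id", "value"])
print(deduped)
```

Unlike the idxmax approach, the mask keeps all tied rows, so the drop_duplicates step is only needed when exactly one row per id is wanted.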
answered 2013-10-22T15:57:23.673