python - Group Pandas DataFrame by row name

Question

I have a simple Pandas DataFrame with row names and 2 columns, sort of like the following.

from pandas import DataFrame, Series
row_names = ['row1', 'row2', 'row2', 'row4']
df = DataFrame({'col1': Series([1, 2, 3, 4], index=row_names),
                'col2': Series([0, 1, 0, 1], index=row_names)})

As with the example above, some row names repeat. I want to group my DataFrame by row names so that I can then perform aggregate operations by group (e.g., count, mean).

For instance, I might want to find out that row1 and row4 appear once each in my df whereas row2 appears twice.

I know of the groupby method, but from the examples I've seen online it only groups by column values, not row names. Is that the case? Should I just make my rownames a column in the DataFrame?

score 2 · Accepted Answer

Check the docstring (if you're using IPython, that's just df.groupby?<enter>):

Group series using mapper (dict or key function, apply given function
to group, return result as series) or by a series of columns

Parameters
----------
by : mapping function / list of functions, dict, Series, or tuple /
    list of column names.
    Called on each element of the object index to determine the groups.
    If a dict or Series is passed, the Series or dict VALUES will be
    used to determine the groups
axis : int, default 0
level : int, level name, or sequence of such, default None
    If the axis is a MultiIndex (hierarchical), group by a particular
    level or levels
...

You want the level argument:

In [20]: df.groupby(level=0).count()
Out[20]: 
      col1  col2
row1     1     1
row2     2     2
row4     1     1

[3 rows x 2 columns]
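
To round this out, here is a minimal sketch of the same idea applied to the other aggregates mentioned in the question, plus the "make the row names a column" route the question asks about. It reuses the question's df. The column name row_name is a hypothetical label chosen for illustration, and this was not run against the pandas version used above, so treat it as a sketch rather than a definitive recipe.

```python
import pandas as pd

row_names = ['row1', 'row2', 'row2', 'row4']
df = pd.DataFrame({'col1': pd.Series([1, 2, 3, 4], index=row_names),
                   'col2': pd.Series([0, 1, 0, 1], index=row_names)})

# Group on the index itself; level=0 refers to the first (here, only) index level.
grouped = df.groupby(level=0)

# size() counts rows per label; count() counts non-null values per column.
sizes = grouped.size()
means = grouped.mean()

# Passing the index object as the grouping key gives the same groups.
means_by_index = df.groupby(df.index).mean()

# Alternative: turn the row names into a regular column and group by it.
# 'row_name' is a hypothetical column name, not something from the question.
as_column = df.rename_axis('row_name').reset_index()
means_by_column = as_column.groupby('row_name').mean()
```

Both routes produce the same groups, so moving the row names into a column is optional; it mainly helps if you also want to group by other columns at the same time.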
