I have a simple Pandas DataFrame with row names and two columns, something like the following:
from pandas import DataFrame, Series
row_names = ['row1', 'row2', 'row2', 'row4']
df = DataFrame({'col1': Series([1, 2, 3, 4], index=row_names),
                'col2': Series([0, 1, 0, 1], index=row_names)})
As with the example above, some row names repeat. I want to group my DataFrame by row names so that I can then perform aggregate operations by group (e.g., count, mean).
For instance, I might want to find out that row1 and row4 appear once each in my df, whereas row2 appears twice.
I know of the groupby method, but from the examples I've seen online it only groups by column values, not row names. Is that the case? Should I just make my row names a column in the DataFrame?
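To make the goal concrete, here is a minimal sketch of the result I am after. It assumes that groupby can group on the index when given level=0, and it also tries the alternative of turning the row names into a column with reset_index. The names counts, means, and by_column are only for illustration, and the frame is built from plain lists, which should be equivalent to the Series version above.

```python
from pandas import DataFrame

row_names = ['row1', 'row2', 'row2', 'row4']
df = DataFrame({'col1': [1, 2, 3, 4],
                'col2': [0, 1, 0, 1]},
               index=row_names)

# Group on the index itself (level 0) instead of on a column.
counts = df.groupby(level=0).size()   # row1 -> 1, row2 -> 2, row4 -> 1
means = df.groupby(level=0).mean()    # per-group mean of col1 and col2

# Alternative: move the row names into a column, then group by that column.
# An unnamed index becomes a column called 'index' after reset_index().
by_column = df.reset_index().groupby('index').size()

print(counts)
print(means)
print(by_column)
```

If both routes behave as sketched, they give the same counts, so the choice would mostly come down to whether the row names are needed as the index afterwards.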