This is a beginner's question, but coming from Stata this seems strangely tricky to me. I would be grateful for any hint.
I have a data frame with the variables district_id, year, party, and votes. I would like to divide the votes per party per district per year (= each row) by that party's total votes in that year (= the blocks shown below). In other words, what percentage did each district contribute to the overall votes received by a party in a given year?
The structure is
district_id year party votes
1 2001 party1 24
2 2001 party1 56
3 2001 party1 12
1 2002 party1 40
2 2002 party1 749
3 2002 party1 26
1 2001 party2 34
2 2001 party2 48
3 2001 party2 23
1 2002 party2 34
2 2002 party2 48
3 2002 party2 98
I created the subtotals for each party/year group with:
agg <- aggregate(df$votes, list(df$party, df$year), FUN="sum")
But how can I divide the cells in the dataframe by the stored results in agg? In the end I would like to have a new column with the percentage.
Isn't there an easier way (like egen .. by: in Stata)?
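For illustration, here is a minimal sketch of what such a by-group computation might look like in base R. It assumes the data frame is called df and has the columns shown above. The column names total and pct are made up for this example. ave() is probably the closest base R analogue to egen .. by:, since it returns the group total aligned with each row. The second part merges the stored agg result back onto df instead.

```r
# Example data copied from the table above
df <- data.frame(
  district_id = rep(1:3, times = 4),
  year        = rep(c(2001, 2002, 2001, 2002), each = 3),
  party       = rep(c("party1", "party2"), each = 6),
  votes       = c(24, 56, 12, 40, 749, 26, 34, 48, 23, 34, 48, 98)
)

# Option 1: ave() returns a vector as long as df$votes that holds,
# for every row, the total of its party/year group
df$total <- ave(df$votes, df$party, df$year, FUN = sum)
df$pct   <- 100 * df$votes / df$total  # hypothetical column name

# Option 2: merge the aggregated totals back onto the rows.
# aggregate() names the summed column "x" when given an unnamed vector.
agg    <- aggregate(df$votes, list(party = df$party, year = df$year), FUN = sum)
merged <- merge(df, agg, by = c("party", "year"))
merged$pct_from_agg <- 100 * merged$votes / merged$x  # hypothetical column name
```

Within each party/year block the pct values should sum to 100. Note that merge() may reorder the rows, whereas ave() keeps the original row order.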