r - dividing cells by subtotal of a dataframe in R

Question

This is a beginner's question, but coming from Stata this seems strangely tricky to me. I would be grateful for any hint.

I have a dataframe with the variables district_id, year, party, and votes. I would like to divide the votes per party per district per year (= each row) by that party's total votes in that year (= the blocks shown below). In other words: what percentage did each district contribute to the overall votes received by a party in a given year?

The structure is

 district_i year    party   votes

  1 2001    party1   24
  2 2001    party1   56
  3 2001    party1   12

  1 2002    party1   40
  2 2002    party1   749
  3 2002    party1   26

  1 2001    party2   34
  2 2001    party2   48
  3 2001    party2   23

  1 2002    party2   34
  2 2002    party2   48
  3 2002    party2   98

I created the subtotals for each party/year group with

agg <- aggregate(df$votes, list(df$party, df$year), FUN="sum")

But how can I divide the cells in the dataframe by the stored results in agg? In the end I would like to have a new column with the percentage.

Isn't there an easier way (like egen .. by: in Stata)?

score 0 · Accepted Answer

Try this:

transform(df, percent = 100 * ave(votes, year, party, FUN = prop.table))

answered 2013-07-04T18:12:27.737
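
To see what ave() is doing here, it may help to split the one-liner into two steps. This is only a sketch that rebuilds the question's sample data as df (the question does not show how df was created, so the data.frame() call is an assumption) and uses sum instead of prop.table so the group total is visible as its own column; total and percent are names chosen for illustration.

```r
# Sample data from the question, rebuilt by hand (assumed layout of df)
df <- data.frame(
  district_i = rep(1:3, times = 4),
  year       = rep(c(2001, 2002, 2001, 2002), each = 3),
  party      = rep(c("party1", "party2"), each = 6),
  votes      = c(24, 56, 12, 40, 749, 26, 34, 48, 23, 34, 48, 98)
)

# ave() returns a vector as long as votes; each element is FUN applied
# to the group (year x party) that the element belongs to
df$total <- ave(df$votes, df$year, df$party, FUN = sum)

# Share of each district in its party's yearly total, in percent
df$percent <- 100 * df$votes / df$total

df
```

Because ave() keeps the original row order and length, the result can be assigned straight back as a new column, which is probably the closest base R equivalent to egen .. by: in Stata.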

score 0 · Accepted Answer

Like this?

DF <- read.table(text="district_i year    party   votes
  1 2001    party1   24
  2 2001    party1   56
  3 2001    party1   12
  1 2002    party1   40
  2 2002    party1   749
  3 2002    party1   26
  1 2001    party2   34
  2 2001    party2   48
  3 2001    party2   23
  1 2002    party2   34
  2 2002    party2   48
  3 2002    party2   98", header=TRUE)

library(plyr)
ddply(DF, .(year,party), transform, contrib = votes / sum(votes))

#    district_i year  party votes    contrib
# 1           1 2001 party1    24 0.26086957
# 2           2 2001 party1    56 0.60869565
# 3           3 2001 party1    12 0.13043478
# 4           1 2001 party2    34 0.32380952
# 5           2 2001 party2    48 0.45714286
# 6           3 2001 party2    23 0.21904762
# 7           1 2002 party1    40 0.04907975
# 8           2 2002 party1   749 0.91901840
# 9           3 2002 party1    26 0.03190184
# 10          1 2002 party2    34 0.18888889
# 11          2 2002 party2    48 0.26666667
# 12          3 2002 party2    98 0.54444444
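
If you would rather keep working with the aggregate() result from the question, one possible route is to merge the subtotals back onto the rows and divide. This is a sketch in base R only, not part of the answer above: it uses the formula interface of aggregate() so the columns keep their names, and the column names total and contrib are chosen for illustration.

```r
# Sample data from the question
DF <- data.frame(
  district_i = rep(1:3, times = 4),
  year       = rep(c(2001, 2002, 2001, 2002), each = 3),
  party      = rep(c("party1", "party2"), each = 6),
  votes      = c(24, 56, 12, 40, 749, 26, 34, 48, 23, 34, 48, 98)
)

# Subtotal of votes per party and year (same idea as agg in the question)
agg <- aggregate(votes ~ party + year, data = DF, FUN = sum)
names(agg)[names(agg) == "votes"] <- "total"

# Attach each row's group subtotal, then divide
out <- merge(DF, agg, by = c("party", "year"))
out$contrib <- out$votes / out$total

out
```

Note that merge() sorts the result by the key columns by default, so the rows may come back in a different order than in DF.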
