I have a data frame that looks like this:

  year country inhabitants
1    1       A          15
2    2       A          10
3    3       A          21
4    1       B          76
5    2       B          69
6    3       B          58
7    1       C         120
8    2       C         131
9    3       C         128

Now I want to compute, for each year, the sum of "inhabitants" across all countries. That is, my desired result looks like this:

  year country inhabitants sum_inhabitants
1    1       A          15             211
2    2       A          10             210
3    3       A          21             207
4    1       B          76             211
5    2       B          69             210
6    3       B          58             207
7    1       C         120             211
8    2       C         131             210
9    3       C         128             207

My original data frame contains many more observations, which is why I can't do the calculation by hand.

2 Answers

Using the dplyr package, you can do the following:

library(dplyr)
df %>% group_by(year) %>% summarise(sum_inhabitants = sum(inhabitants))

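If you really want to keep the repeated values in that column and add it to the original data frame, change summarise above to mutate; that gives you exactly the output you specified.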
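
As a minimal sketch of that mutate variant, assuming the data frame is called df and is rebuilt here from the question's sample data (with 21 for country A in year 3, so that the totals match the expected output). The name result and the trailing ungroup() are additions for illustration only:

```r
library(dplyr)

# Sample data rebuilt from the question (assumed to be named `df`)
df <- data.frame(
  year = rep(1:3, times = 3),
  country = rep(c("A", "B", "C"), each = 3),
  inhabitants = c(15, 10, 21, 76, 69, 58, 120, 131, 128)
)

# mutate() keeps all nine rows and repeats the per-year total within each group
result <- df %>%
  group_by(year) %>%
  mutate(sum_inhabitants = sum(inhabitants)) %>%
  ungroup()  # drop the grouping so later steps are not grouped by accident

print(result)
```

Unlike summarise, which returns one row per year, this keeps one row per original observation.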

If you want the sum by year and country instead, you can do this:

df %>% group_by(year, country) %>% summarise(sum_inhabitants = sum(inhabitants))
answered 2016-01-14T16:05:35.360

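We can use ave to sum by year with no need for outside packages. Its advantage over aggregate is that it does not collapse the data to one row per group: it returns a vector as long as its input, so the per-year totals can be assigned straight into a new column: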

df$sum_inhabitants <- ave(df$inhabitants, df$year, FUN=sum)
#   year country inhabitants sum_inhabitants
# 1    1       A          15             211
# 2    2       A          10             210
# 3    3       A          21             207
# 4    1       B          76             211
# 5    2       B          69             210
# 6    3       B          58             207
# 7    1       C         120             211
# 8    2       C         131             210
# 9    3       C         128             207
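
For contrast, here is a rough sketch of the aggregate route mentioned above, using the sample data rebuilt from the question (again with 21 for country A in year 3). The names totals, merged and by_ave are only illustrative:

```r
# Sample data rebuilt from the question (assumed to be named `df`)
df <- data.frame(
  year = rep(1:3, times = 3),
  country = rep(c("A", "B", "C"), each = 3),
  inhabitants = c(15, 10, 21, 76, 69, 58, 120, 131, 128)
)

# aggregate() collapses the data to one row per year
totals <- aggregate(inhabitants ~ year, data = df, FUN = sum)
names(totals)[2] <- "sum_inhabitants"

# A second step is needed to attach the totals to the original rows
merged <- merge(df, totals, by = "year")

# ave() does the same in one step and keeps the original row order
by_ave <- ave(df$inhabitants, df$year, FUN = sum)
```

Note that merge sorts the result by the key column, so merged is ordered by year rather than by country as in the original data frame.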
answered 2016-01-14T16:04:05.913