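I'm very new to R and have a question about loops.
My real dataset has 7000 observations across 80 countries, with 15 sectors and 6 types of organization, but here is a simplified example.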
country <- c("a","a","a","a","a","a","b","b","b","b","b","b",
             "c","c","c","c","c","c","d","d","d","d","d","d")
sector <- c("a","a","a","b","c","c","a","b","b","b","c","c",
            "b","b","b","b","c","c","a","a","b","b","c","c")
organization <- c("a","b","c","c","b","a","a","b","b","c","b","b",
                  "c","a","a","b","b","c","c","b","a","a","b","c")
budget <- c(2,4,3,5,9,7,5,4,3,6,1,2,4,5,6,1,5,3,4,2,3,5,4,6)
table <- data.frame(country, sector, organization, budget)
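What I want is:
- The number of organizations of each type within a given sector of a given country.
- The percentage of that sector's total budget allocated to each type of organization.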
First I had to make a subset that selects only the rows for country "a" and sector "a":
smalltable <- subset(table, (country == "a") & (sector == "a"))
Then, to answer my first question (how many organizations of each type there are in one sector of one country):
smalltable$count <- 1  # each row is one organization; tapply() sums these per type below
Then I need each row's share of the sector budget:
smalltable$percentage <- smalltable$budget / sum(smalltable$budget)
Then I used tapply to sum both by organization type:
N <- tapply(smalltable$count, smalltable$organization, FUN=sum)
financialshare <- tapply(smalltable$percentage, smalltable$organization, FUN=sum)
Finally I combined the results:
total <- data.frame(country = "a", sector = "a", organization = names(N), N, financialshare)
total
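This is the small table I need!
But I need it for all 15 sectors and all 80 countries, so I need some kind of loop that runs over all the sectors and repeats this for each country. I need the tables to be as compact as possible, with all the information for one country (that is, its 15 sectors) gathered in a single table. Zero values should also be dropped from the table to save space.
How should I proceed?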
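
To make the target concrete, here is a rough sketch of the kind of result I am after, using base R's aggregate(), ave() and split() instead of an explicit loop. I am not sure this is the best approach. The names dat, res and by_country are made up for illustration (dat only avoids reusing the name table), and the sketch assumes the real data has the same four columns as the toy example.

```r
# Same toy data as above, rebuilt so the sketch runs on its own.
country <- rep(c("a", "b", "c", "d"), each = 6)
sector <- c("a","a","a","b","c","c","a","b","b","b","c","c",
            "b","b","b","b","c","c","a","a","b","b","c","c")
organization <- c("a","b","c","c","b","a","a","b","b","c","b","b",
                  "c","a","a","b","b","c","c","b","a","a","b","c")
budget <- c(2,4,3,5,9,7,5,4,3,6,1,2,4,5,6,1,5,3,4,2,3,5,4,6)
dat <- data.frame(country, sector, organization, budget)  # hypothetical name

# Total budget per country/sector/organization combination.
# The formula method only returns combinations that actually occur,
# so there are no zero rows to remove afterwards.
res <- aggregate(budget ~ country + sector + organization, data = dat, FUN = sum)

# Number of organizations per combination (same formula, so same row order).
res$N <- aggregate(budget ~ country + sector + organization,
                   data = dat, FUN = length)$budget

# Share of the sector budget within each country/sector group.
res$financialshare <- res$budget /
  ave(res$budget, res$country, res$sector, FUN = sum)

# Sort for readability, then make one table per country.
res <- res[order(res$country, res$sector, res$organization), ]
by_country <- split(res, res$country)

by_country$a  # all sectors of country "a" in a single table
```

Here by_country is a list with one data frame per country, so by_country[["b"]] would give the table for country "b". Whether this scales nicely to 80 countries and 15 sectors is something I have not checked.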