r - R - Merge rows in a data frame to fill in NAs given multiple identifiers

Question

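Suppose I have a data frame containing 5 years of data on the number of homicides in the 50 largest cities in each of the 50 US states. The data frame also holds each city's population and the number of guns owned there. However, each row contains only one of population, homicides, or guns (see the example df below):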

df1 = data.frame(state=1:50, city=rep(1:50, each=50), year=rep(1:5, each=2500), population=sample(1000:200000,12500), homicides=NA, guns=NA)
df2 = data.frame(state=1:50, city=rep(1:50, each=50), year=rep(1:5, each=2500), population=NA, homicides=sample(1:200,12500,replace=TRUE), guns=NA)
df3 = data.frame(state=1:50, city=rep(1:50, each=50), year=rep(1:5, each=2500), population=NA, homicides=NA, guns=round((df1$population/sample(2:20,12500,replace=TRUE))))
df = rbind(df1, df2, df3)

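The resulting data frame is 25,000 rows longer than it needs to be, because each row representing a unique combination of state, city, and year could hold the population, homicides, and guns data together rather than just one of them. In other words, it could look something like this: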

df.ideal = data.frame(state=1:50, city=rep(1:50, each=50), year=rep(1:5, each=2500), population=sample(1000:200000,12500), homicides=sample(1:200,12500,replace=TRUE), guns=round((df1$population/sample(2:20,12500,replace=TRUE))))
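The row counts above can be checked with a short sketch. It rebuilds only the three identifier columns from the question; the names keys and stacked are introduced here purely for illustration.

```r
# Identifier columns only, built the same way as in df1, df2, and df3.
# data.frame() recycles the shorter vectors a whole number of times.
keys <- data.frame(state = 1:50,
                   city  = rep(1:50, each = 50),
                   year  = rep(1:5, each = 2500))

# Stacking three copies mirrors rbind(df1, df2, df3).
stacked <- rbind(keys, keys, keys)

n_stacked <- nrow(stacked)        # rows in the stacked frame
n_unique  <- nrow(unique(keys))   # unique state/city/year combinations
n_excess  <- n_stacked - n_unique # rows that collapsing would remove
```

Under these assumptions, n_excess is the number of rows that merging the three partial rows per combination would remove.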

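Starting from df, how can I merge the rows so that there is a single row holding population, guns, and homicides for each combination of state, city, and year, resulting in something like df.ideal?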

Unfortunately, the solution must also work for unbalanced data frames. Ideally, it would also issue a warning whenever a value replaces anything other than NA.
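One possible approach, offered as a minimal base-R sketch and not a definitive answer: group by the identifier columns with aggregate() and keep the single non-NA value in each remaining column. The helpers collapse_value and collapse_rows are hypothetical names introduced here, and the toy data frame is made up to show an unbalanced case (no guns row for the second state).

```r
# Hypothetical helper: reduce a vector to its single non-NA value.
# Returns NA (of the input's type) if every value is missing, and
# warns if the non-NA values disagree, keeping the first one.
collapse_value <- function(x) {
  vals <- unique(x[!is.na(x)])
  if (length(vals) == 0) return(x[1])
  if (length(vals) > 1) {
    warning("conflicting non-NA values in a group; keeping the first")
  }
  vals[1]
}

# Hypothetical helper: one row per unique combination of the key columns.
# The data.frame method of aggregate() passes NA values through to FUN.
collapse_rows <- function(data, keys) {
  value_cols <- setdiff(names(data), keys)
  aggregate(data[value_cols], by = data[keys], FUN = collapse_value)
}

# Toy, unbalanced input: state 2 has no guns row at all.
toy <- data.frame(
  state      = c(1, 1, 1, 2, 2),
  city       = c(1, 1, 1, 1, 1),
  year       = c(1, 1, 1, 1, 1),
  population = c(5000, NA, NA, 8000, NA),
  homicides  = c(NA, 12, NA, NA, 3),
  guns       = c(NA, NA, 400, NA, NA)
)

merged <- collapse_rows(toy, c("state", "city", "year"))
```

Applied to the question's data, the call would be collapse_rows(df, c("state", "city", "year")). Because the warning is raised inside the aggregation function, it does not say which group conflicted; identifying the group would need extra bookkeeping.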

