I have this data frame:
set.seed(50)
data <- data.frame(age=c(rep("juv", 10), rep("ad", 10)),
                   sex=c(rep("m", 10), rep("f", 10)),
                   size=c(rep("large", 10), rep("small", 10)),
                   length=rnorm(20),
                   width=rnorm(20),
                   height=rnorm(20))
data$length[sample(1:20, size=8, replace=FALSE)] <- NA
data$width[sample(1:20, size=8, replace=FALSE)] <- NA
data$height[sample(1:20, size=8, replace=FALSE)] <- NA
age sex size length width height
1 juv m large NA -0.34992735 0.10955641
2 juv m large -0.84160374 NA -0.41341885
3 juv m large 0.03299794 -1.58987765 NA
4 juv m large NA NA NA
5 juv m large -1.72760411 NA 0.09534935
6 juv m large -0.27786453 2.66763339 0.49988990
7 juv m large NA NA NA
8 juv m large -0.59091244 -0.36212039 -1.65840096
9 juv m large NA 0.56874633 NA
10 juv m large NA 0.02867454 -0.49068623
11 ad f small 0.29520677 0.19902339 NA
12 ad f small 0.55475223 -0.85142228 0.33763747
13 ad f small NA NA -1.96590570
14 ad f small 0.19573384 0.59724896 -2.32077461
15 ad f small -0.45554055 -1.09604786 NA
16 ad f small -0.36285547 0.01909655 1.16695158
17 ad f small -0.15681338 NA NA
18 ad f small NA NA NA
19 ad f small NA 0.40618657 -1.33263085
20 ad f small -0.32342568 NA -0.13883976
I am trying to write a function that counts the number of NA values in length, width and height for each level of the three factors in the data frame. I tried this:
exploreMissingValues <- function(dataframe, factors, variables){
  library(plyr)
  Variables <- list(variables)
  llply(Variables, function(x) ddply(dataframe, .(factors),
                                     summarise,
                                     number.of.NA=length(x[is.na(x)])))
}
exploreMissingValues(data,
                     c("age", "sex", "size"),
                     c("length", "width", "height"))
...but this gives an error. How can I make this function return the number of NA values for each factor level in the data frame?
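
For reference, here is a minimal base R sketch of the kind of result I am after. It is not a fix of the plyr version above. Two things in my attempt look suspect: `.(factors)` quotes the symbol `factors` itself rather than the column names stored in it, and `x` is only a column name (a string), so `is.na(x)` never looks at the column. The helper name `countMissingByLevel` is made up for illustration, and the sketch assumes `factors` and `variables` are character vectors of column names.

```r
# Hypothetical helper (name invented for illustration), base R only.
# Assumes `factors` and `variables` are character vectors of column names.
countMissingByLevel <- function(dataframe, factors, variables) {
  # One element per factor; within each, one column per variable
  result <- lapply(factors, function(f) {
    sapply(variables, function(v) {
      # is.na() gives TRUE/FALSE per row; summing within each level counts the NAs
      tapply(is.na(dataframe[[v]]), dataframe[[f]], sum)
    })
  })
  names(result) <- factors
  result
}
```

Called as `countMissingByLevel(data, c("age", "sex", "size"), c("length", "width", "height"))`, this should return a list with one matrix per factor, with the factor's levels as rows and the three variables as columns. If a factor has only one level, `sapply` simplifies to a named vector instead of a matrix.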