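Edit: I stand corrected. Backlin's suggestion to use %in% rather than == x & !is.na() is better, and MadScone's suggestion to use subset() is better still, because the result remains a data.frame even when only one column is selected as output, i.e.: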
> class(subset(df, Ni == 1 & Zn == 1 & is.na(Cu), Species))
[1] "data.frame"
# whereas [ indexing returns a vector when only one column is selected...
> class(df[df$Ni %in% 1 & df$Zn %in% 1 & is.na(df$Cu), 1])
[1] "integer"
# but we get data.frame when using multiple columns...
> class(df[df$Ni %in% 1 & df$Zn %in% 1 & is.na(df$Cu), 1:2])
[1] "data.frame"
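If you would rather stay with [ indexing, base R's drop = FALSE argument keeps the result a data.frame even for a single column. This is a minimal sketch rather than part of the original suggestions; it rebuilds the example data so it runs on its own, and the names keep, as_vector and as_frame are only illustrative.

```r
# Same example data as in the setup section of this answer.
df <- data.frame(Species = 1:4,
                 Ni = c(1, NA, 1, 1),
                 Zn = c(1, 1, 1, 1),
                 Cu = c(NA, NA, 1, NA))

# Row filter using the NA-safe %in% idiom.
keep <- df$Ni %in% 1 & df$Zn %in% 1 & is.na(df$Cu)

# Default behaviour: a single selected column is dropped to a vector.
as_vector <- df[keep, "Species"]

# drop = FALSE keeps the data.frame structure.
as_frame <- df[keep, "Species", drop = FALSE]

class(as_vector)  # "integer"
class(as_frame)   # "data.frame"
```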
I'm leaving my sub-par answer below only to document this alternative idiom as one that should be avoided!
Setup:
> df <- structure(list(Species = 1:4, Ni = c(1, NA, 1, 1), Zn = c(1,
1, 1, 1), Cu = c(NA, NA, 1, NA)), .Names = c("Species", "Ni",
"Zn", "Cu"), row.names = c(NA, -4L), class = "data.frame")
> df
  Species Ni Zn Cu
1       1  1  1 NA
2       2 NA  1 NA
3       3  1  1  1
4       4  1  1 NA
Query:
> df[df$Ni == 1 & !is.na(df$Ni)
     & df$Zn == 1 & !is.na(df$Zn)
     & is.na(df$Cu), ]
  Species Ni Zn Cu
1       1  1  1 NA
4       4  1  1 NA
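The trick with NA values is to exclude them explicitly: for Ni, for example, request the value 1 AND !is.na(df$Ni), and likewise for the other columns. Failing to do so returns spurious records, for example ones where Ni is NA.
As said above, the df[df$Ni %in% 1 & df$Zn %in% 1 & is.na(df$Cu), ] idiom is preferable, and using subset() is generally better still. For comparison, here is the same query without the explicit NA checks: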
> df[df$Ni == 1 & df$Zn == 1 & is.na(df$Cu), ]
   Species Ni Zn Cu
1        1  1  1 NA
NA      NA NA NA NA # OOPS...
4        4  1  1 NA
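The stray NA row comes from R's three-valued logic: comparing NA with == yields NA rather than FALSE, and an NA in a logical row index produces an all-NA row. %in% never returns NA, and subset() treats NA conditions as FALSE, which is why both avoid the problem. A small sketch on the same data (the variable names are only illustrative):

```r
# Same example data as in the setup section of this answer.
df <- data.frame(Species = 1:4,
                 Ni = c(1, NA, 1, 1),
                 Zn = c(1, 1, 1, 1),
                 Cu = c(NA, NA, 1, NA))

ni_eq <- df$Ni == 1    # TRUE    NA TRUE TRUE
ni_in <- df$Ni %in% 1  # TRUE FALSE TRUE TRUE

# The NA in the index yields an extra all-NA row.
bad <- df[df$Ni == 1 & df$Zn == 1 & is.na(df$Cu), ]

# NA-safe alternatives: both return rows 1 and 4 only.
good <- df[df$Ni %in% 1 & df$Zn %in% 1 & is.na(df$Cu), ]
same <- subset(df, Ni == 1 & Zn == 1 & is.na(Cu))

nrow(bad)   # 3
nrow(good)  # 2
nrow(same)  # 2
```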