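I have a 721 x 26 data frame. Some rows have entries that are blank. They are not NULL or NA, just empty, as shown below. How can I delete the rows that have such entries?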

1         Y    N        Y          N            86.8
2         N    N        Y          N            50.0
3                                               76.8
4         N    N        Y          N            46.6
5         Y    Y        Y          Y            30.0
1 Answer

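The answer to this question depends on how paranoid you want to be about the kinds of things that might appear in a "blank" string. Here is a fairly careful approach that matches the zero-length string "" as well as any string made up of one or more [[:space:]] characters (i.e. "tab, newline, vertical tab, form feed, carriage return, space and possibly other locale-dependent characters", according to the ?regex help page).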

## An example data.frame containing all sorts of 'blank' strings
df <- data.frame(A = c("a", "", "\n", " ", " \t\t", "b"),
                 B = c("b", "b", "\t", " ", "\t\t\t", "d"),
                 C = 1:6)

## Test each element to see if it is either zero-length or contains only
## space characters
pat <- "^[[:space:]]*$"
subdf <- df[!names(df) %in% "C"] # drops columns not involved in the test (safe even if nothing matches)
matches <- data.frame(lapply(subdf, function(x) grepl(pat, x))) 

## Subset df to remove rows fully composed of elements matching `pat` 
df[!apply(matches, 1, all),]
#   A B C
# 1 a b 1
# 2   b 2
# 6 b d 6

## OR, to remove rows with *any* blank entries
df[!apply(matches, 1, any),]
#   A B C
# 1 a b 1
# 6 b d 6
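
As a possible variation on the approach above (a sketch, not part of the original answer), the blank strings can first be converted to NA, after which base R's complete.cases() does the row filtering. The helper name blank_to_na is hypothetical and introduced only for illustration. Note that this version also drops rows that already contained NA, which may or may not be what you want.

```r
## Hypothetical helper: replace blank or whitespace-only strings with NA.
## Non-character, non-factor columns are returned unchanged.
blank_to_na <- function(x) {
  if (is.character(x) || is.factor(x)) {
    x <- as.character(x)
    x[grepl("^[[:space:]]*$", x)] <- NA
  }
  x
}

## Same example data as above
df <- data.frame(A = c("a", "", "\n", " ", " \t\t", "b"),
                 B = c("b", "b", "\t", " ", "\t\t\t", "d"),
                 C = 1:6,
                 stringsAsFactors = FALSE)

## Build a copy in which blanks are NA, keeping the data.frame structure
cleaned <- df
cleaned[] <- lapply(df, blank_to_na)

## Keep only the rows of df that have no blank (now NA) entries
result <- df[complete.cases(cleaned), ]
```

Here every column is tested, so there is no need to set aside the numeric column C first; it simply passes through the helper untouched.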
answered 2012-07-16T23:06:39.617