2

I have a matrix in which many columns contain only NA values. I want to drop every column that is entirely NA. How can I do that?


3 Answers

3

I assume you only wish to omit columns where all observations are NA; your question is somewhat ambiguous.

For a matrix x, this keeps only the columns with at least one non-NA value, dropping those that are entirely NA:

x[, apply(!is.na(x), 2, any), drop = FALSE]
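To make the behaviour concrete, here is a minimal sketch on a toy matrix. The data and the names `keep` and `res` are made up for illustration; `drop = FALSE` is added on the assumption that you want a matrix back even when only one column survives.

```r
# Toy matrix (illustrative data): column "b" is entirely NA, column "c" is partly NA
x <- matrix(c(1, 2, 3,
              NA, NA, NA,
              4, NA, 6),
            nrow = 3,
            dimnames = list(NULL, c("a", "b", "c")))

# TRUE for every column that has at least one non-NA value
keep <- apply(!is.na(x), 2, any)

# drop = FALSE keeps the result a matrix even if a single column remains
res <- x[, keep, drop = FALSE]
print(colnames(res))
```

The partly-NA column "c" is kept; only "b", which has no observed values at all, is removed.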
answered 2013-04-01T23:48:16.587
2
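# Keep the columns that are not entirely NA (a logical mask also works when no column is all-NA)
mtx[ , colSums(is.na(mtx)) < nrow(mtx), drop = FALSE ]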

If you want to exclude columns in which more than 50% of the entries are NA:

mtx[ , colSums(is.na(mtx)) <= nrow(mtx)/2, drop = FALSE ]
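A caveat for this kind of column filter: negating the result of `which()` misbehaves when no column meets the condition, because an empty negative index selects zero columns rather than all of them. A logical mask does not have this problem. The sketch below shows both on made-up data; the variable names are illustrative only.

```r
# Illustrative matrix with no all-NA column
mtx <- matrix(c(1, NA, 3, 4), nrow = 2)

# Logical flag per column: is every entry NA?
all_na <- colSums(is.na(mtx)) == nrow(mtx)

# which() finds nothing, so the negative index is empty and no column is selected
bad <- mtx[, -which(all_na), drop = FALSE]
print(ncol(bad))

# Negating the logical mask keeps every column that is not flagged
good <- mtx[, !all_na, drop = FALSE]
print(ncol(good))

# Same idea with a threshold: colMeans(is.na(mtx)) is the NA fraction per column
mostly_na <- colMeans(is.na(mtx)) > 0.5
filtered <- mtx[, !mostly_na, drop = FALSE]
```

Here `bad` ends up with no columns even though nothing should have been removed, while `good` and `filtered` keep both.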
answered 2013-04-02T01:25:25.317
1

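You can use the function `na.omit()` to remove rows that contain NA observations. It drops those rows and returns the data with no NAs left.

If you wish to remove columns in which every observation is NA...

I'm not sure whether there is a built-in R function that does this. However, we might consider a user-defined procedure that removes the column with the most NAs...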

### Assume 'df' is your data frame with observational data:

### Flag each observation that is NA (logical matrix with the same shape as 'df')
count <- is.na(df)
### Within each column, count the missing observations
count <- colSums(count)
### Ask R which column has the most missing observations
### (which.max returns only the first such column)
index <- which.max(count)
### Subset 'df' to exclude that column; drop = FALSE keeps a data frame
df <- df[, -index, drop = FALSE]
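Note that the procedure above removes a single column, the one with the highest NA count, whether or not it is entirely NA. If the goal is strictly to drop columns where every observation is NA, a minimal sketch for a data frame could look like this. The data frame `df_example` and its column names are invented for illustration.

```r
# Illustrative data frame: 'empty' has no observed values, 'score' has one NA
df_example <- data.frame(id = 1:3,
                         empty = c(NA, NA, NA),
                         score = c(2.5, NA, 4.0))

# A column is entirely NA when its NA count equals the number of rows
all_na <- colSums(is.na(df_example)) == nrow(df_example)

# Keep the remaining columns; drop = FALSE preserves the data frame class
df_clean <- df_example[, !all_na, drop = FALSE]
print(names(df_clean))
```

Because the mask is logical, this also does the right thing when no column is entirely NA: nothing is removed.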
answered 2013-04-01T21:57:54.643