When I try to compute the covariance matrix of the Iris data in R, I keep getting NA.

library(ggplot2)
library(dplyr)

dim(iris)
head(iris)

numIris <- iris %>% 
  select_if(is.numeric)

plot(numIris[1:100,])

Xraw <- numIris[1:1000,]

plot(iris[1:150,-c(5)]) #species name is the 5th column; excluding it here.
Xraw = iris[1:1000,-c(5)] # this excludes the 5th column, which is the species column
#first, to get covariance, we need to subtract the mean from each column

X = scale(Xraw, scale = FALSE)

head(X)

Xs <- scale(Xraw, scale = TRUE)
head(Xs)

covMat  = (t(X)%*%X)/ (nrow(X)-1)
head(covMat)

1 Answer

Is there any reason you can't use cov(numIris)?

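By trying to select 1000 rows from a matrix/data frame that has only 150 rows, you end up with 850 rows full of NA values (try tail(Xraw) to see them). Those NA rows carry through the matrix product t(X) %*% X, so every entry of covMat comes out as NA. If you set Xraw <- iris[, -5] and go from there, all.equal(covMat, cov(iris[, -5])) returns TRUE.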
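For reference, here is a minimal sketch of the corrected workflow, assuming only base R and the built-in iris dataset. The name `padded` is introduced here purely for illustration and is not part of the original code.

```r
# iris ships with base R (datasets package) and has 150 rows.
Xraw <- iris[, -5]             # drop the Species column, keep all 150 rows

# Illustration only: indexing past the last row pads the result with NA rows.
padded <- iris[1:1000, -5]
sum(!complete.cases(padded))   # 850 rows that contain only NA

# Center each column, then form the sample covariance matrix by hand.
X <- scale(Xraw, scale = FALSE)
covMat <- (t(X) %*% X) / (nrow(X) - 1)

# Compare against the built-in estimator.
all.equal(covMat, cov(Xraw))   # TRUE
```

If the row count is not known in advance, indexing with seq_len(nrow(iris)) instead of a hard-coded range avoids the NA padding.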

Answered 2021-10-31T23:38:15.277