我是生物学家。我的实验输出包含大量特征(存储为列数和 563 行)。这些列是数量为 8603 的特征,这些特征非常高。
因此,当我尝试在 R 中进行 PCA 分析时,它会出现“内存不足”错误。
我也尝试过分片做princomp,但它似乎不适用于我们的方法。
我尝试使用链接中给出的脚本...
但它还是不行:(
我正在尝试使用以下代码
bumpus <- read.table("http://www.ndsu.nodak.edu/ndsu/doetkott/introsas/rawdata/bumpus.html",
skip=20, nrows=49,
col.names=c("id","total","alar","head","humerus","sternum"))
boxplot(bumpus, main="Boxplot of Bumpus' data") ## in this step it is showing the ERROR
# we first standardize the data:
bumpus.scaled <- data.frame( apply(bumpus,2,scale) )
boxplot(bumpus.scaled, main="Boxplot of standardized Bumpus' data")
pca.res <- prcomp(bumpus.scaled, retx=TRUE)
pca.res
# note:
# PC.1 is some kind of average of all the measurements
# => measure of size of the bird
# PC.2 has a negative weight for 'sternum'
# and positive weights for 'alar', 'head' and 'humerus'
# => measure of shape of the bird
# first two principal components:
pca.res$x[,1:2]
plot(pca.res$x[,1:2], pch="", main="PC.1 and PC.2 for Bumpus' data (blue=survived, red=died)")
text(pca.res$x[,1:2], labels=c(1:49), col=c(rep("blue",21),rep("red",28)))
abline(v=0, lty=2)
abline(h=0, lty=2)
# compare to segment plot:
windows()
palette(rainbow(12, s = 0.6, v = 0.75))
stars(bumpus, labels=c(1:49), nrow=6, key.loc=c(20,-1),
main="Segment plot of Bumpus' data", draw.segment=TRUE)
# compare to biplot:
windows()
biplot(pca.res, scale=0)
# what do the arrows mean?
# consider the arrow for sternum:
abline(0, pca.res$rotation[5,2]/pca.res$rotation[5,1])
# consider the arrow for head:
abline(0, pca.res$rotation[3,2]/pca.res$rotation[3,1])
但是第二行
boxplot(bumpus, main="Bumpus 数据的箱线图") ## 显示错误
错误是
Error: cannot allocate vector of size 1.4 Mb
In addition: There were 27 warnings (use warnings() to see them)
请帮忙!