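The script below worked for me on the same data when computing Pearson's correlation. I recently adapted it to create a covariance matrix to feed into a PCA. I read on a forum that supplying a pre-computed covariance matrix might avoid memory problems, but that has not been the case for me. When I run the covariance matrix step I get the following error: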
Error: cannot allocate vector of size 1.1 Gb
In addition: Warning messages:
1: In na.omit.default(cbind(x, y)) :
Reached total allocation of 6141Mb: see help(memory.size)
2: In na.omit.default(cbind(x, y)) :
Reached total allocation of 6141Mb: see help(memory.size)
3: In na.omit.default(cbind(x, y)) :
Reached total allocation of 6141Mb: see help(memory.size)
4: In na.omit.default(cbind(x, y)) :
Reached total allocation of 6141Mb: see help(memory.size)
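Can anyone suggest a more efficient way to do this so that I don't run into memory problems? If I am completely off base in computing the covariance first, that's fine; the PCA is the only thing I ultimately need. My data are 12 single-band rasters in ArcGIS raster format, each 581.15 MB in size. Any help is greatly appreciated.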
library(rgdal)
library(raster)
setwd("K:/Documents/SDSU/Thesis/GIS Data All/GIS Layers/Generated_Layers/GridsForCor")
# List the full path to each raster:
raster_files <- c('aspectclp',
                  'lakedistclp',
                  'ocdistclp',
                  'popdenclp',
                  'roaddistclp',
                  'scurveclp',
                  'sdemclp',
                  'solarradclp',
                  'sslopeclp',
                  'vegcatclp',
                  'canopcvrclp',
                  'canophtclp')
cov_matrix <- matrix(NA, length(raster_files), length(raster_files),
                     dimnames = list(raster_files, raster_files))
for (outer_n in seq_along(raster_files)) {
  outer_raster <- raster(raster_files[outer_n])
  # Start this loop at outer_n rather than 1 so that we don't compute the
  # same covariance twice. The diagonal is included because, unlike in a
  # correlation matrix, it holds each raster's variance rather than 1.
  for (inner_n in outer_n:length(raster_files)) {
    inner_raster <- raster(raster_files[inner_n])
    cov_matrix[outer_n, inner_n] <- cov(outer_raster[], inner_raster[],
                                        use = 'complete.obs', method = "spearman")
    # Mirror the value so that the matrix passed to princomp() is symmetric
    # and has no NA entries:
    cov_matrix[inner_n, outer_n] <- cov_matrix[outer_n, inner_n]
  }
}
# princomp() works from covmat alone, so no data argument is passed:
pca_matrix <- princomp(covmat = cov_matrix, cor = FALSE)
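# Write the loadings to a tab-delimited text file and a CSV file (a princomp
# object cannot be written directly, so the loadings matrix is extracted):
write.table(unclass(loadings(pca_matrix)), "PCA.txt", sep = "\t", col.names = NA)
write.csv(unclass(loadings(pca_matrix)), "PCA.csv")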
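
For reference, here is a minimal base-R sketch of what I understand by feeding a pre-computed covariance matrix into princomp(). It uses a small synthetic matrix as a stand-in for the raster cell values; the names cells, layer_a, layer_b and layer_c are hypothetical and are not my actual layers. It only illustrates the covmat mechanics and does nothing about the memory problem itself.

```r
# Hypothetical stand-in for raster cell values (not the rasters above):
# 1000 "cells" in rows, 3 "layers" in columns.
set.seed(42)
cells <- matrix(rnorm(3000), ncol = 3,
                dimnames = list(NULL, c("layer_a", "layer_b", "layer_c")))
# Make two of the layers correlated so the PCA has some structure to find.
cells[, "layer_b"] <- cells[, "layer_a"] + 0.5 * cells[, "layer_b"]

# A covariance matrix for PCA has to be complete and symmetric, with each
# layer's variance (not 1, and not NA) on the diagonal.
cov_full <- cov(cells)

# PCA from the pre-computed covariance matrix alone.
pca_from_cov <- princomp(covmat = cov_full)

# PCA from the raw values, for comparison.
pca_from_data <- princomp(cells)

# Compare the loadings by absolute value so that sign flips are ignored.
loadings_match <- isTRUE(all.equal(abs(unclass(pca_from_cov$loadings)),
                                   abs(unclass(pca_from_data$loadings))))
print(loadings_match)
```

One thing to keep in mind when comparing the two results: princomp() on raw data uses divisor n for the covariance, while cov() uses n - 1, so the component standard deviations differ by a factor of sqrt((n - 1) / n) even when the loadings agree.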