r - Error in rda test in the vegan R package: variable not being read correctly

Question

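I am trying to run a simple RDA with the vegan package to test the effects of depth, basin and sector on genetic population structure, using the following data frame.

The "ALL" variable is the genetic population assignment (STRUCTURE).

In case the link to my data doesn't work properly, I'll paste a snippet of my data frame here.

I read in the data this way: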

RDAmorph_Oct6 <- read.csv("RDAmorph_Oct6.csv")

My problem is twofold: 1) I can't seem to get my genetic variable to be read correctly. I have tried three things to fix this.

gen=rda(ALL ~ Depth + Basin + Sector, data=RDAmorph_Oct6, na.action="na.exclude")
Error in eval(specdata, environment(formula), enclos = globalenv()) : 
  object 'ALL' not found
In addition: There were 12 warnings (use warnings() to see them)

So I tried this:

> gen=rda("ALL ~ Depth + Basin + Sector", data=RDAmorph_Oct6, na.action="na.exclude")
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

So I made the variable numeric:

> RDAmorph_Oct6$ALL = as.numeric(RDAmorph_Oct6$ALL)
> gen=rda("ALL ~ Depth + Basin + Sector", data=RDAmorph_Oct6, na.action="na.exclude")
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

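I'm really confused. I also tried specifying each variable as dataset$variable, but that doesn't work either.

Oddly, I can get the rda to work if I look at the effect of the environmental variables on a different, composite variable: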

MC = RDAmorph_Oct6[,5:6]
H_morph_var=rda(MC ~ Depth + Basin + Sector, data=RDAmorph_Oct6, na.action="na.exclude")

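Note that I did try extracting only the ALL column for the genetic rda above. That didn't work either. Anyway, this leads to my second problem.

2) When I try to plot the rda, I get a very strange plot. Note the five points in three places. I have no idea where these come from.

I will have to plot the genetic rda as well, and I expect to run into the same problem, so I thought I would ask now.

I have read several tutorials and tried multiple iterations of each problem. What I have provided here is what I think is the best summary. If anyone can give me some clues, I would much appreciate it.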

score 0 · Accepted Answer

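This is partly similar to Gavin Simpson's answer. There is also a problem with the categorical vectors in your data frame. You can use library(data.table) and its rowid function to turn the categorical variables into integers. It would be better not to use them at all. I also wanted to set the ID vector as site names, but I am too lazy right now.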

library(data.table)
RDAmorph_Oct6 <- read.csv("C:/........../RDAmorph_Oct6.csv")

#remove NAs before. I like looking at my dataframes before I analyze them.
RDAmorph_Oct6 <- na.omit(RDAmorph_Oct6)

#I removed one duplicate
RDAmorph_Oct6 <- RDAmorph_Oct6[!duplicated(RDAmorph_Oct6$ID),]

#Create vector with only ALL
ALL  <- RDAmorph_Oct6$ALL

#Create data frame with only numeric vectors and remove ALL
dfn  <- RDAmorph_Oct6[,-c(1,4,11,12)]

#Select all categorical vectors.
dfc  <- RDAmorph_Oct6[,c(1,11,12)]

#Give the categorical vectors integers; this doesn't seem to work for ID (why?).
dfc2 <- as.data.frame(apply(dfc, 2, function(x) rowid(x)))

#Bind back with numeric data frame
dfnc <- cbind.data.frame(dfn, dfc2)

#Select only what you need
df   <- dfnc[c("Depth", "Basin", "Sector")]

#The rest you know
rda.out <- rda(ALL ~ ., data=df, scale=TRUE)
plot(rda.out, scaling = 2, xlim=c(-3,2), ylim=c(-1,1))

#Also plot correlations
plot(cbind.data.frame(ALL, df))

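Sector and Depth show the most variation. That is almost to be expected, since only three vectors are used. The integers assigned to the categorical vectors probably have no meaning at all: the function works from top to bottom and assigns integers to the character strings as it meets them. I am also not really sure which question you want to answer. Once that is clear, you can organise the data frame accordingly.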
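On the integer coding above, here is a minimal base R sketch of two different things that "integers for a categorical vector" can mean. The basin vector is made up for illustration and is not taken from the question's data. As far as I understand data.table's rowid(), it returns an occurrence counter within each group, which is what the ave() line reproduces here. If that is right, it would also explain why nothing happens for ID: every ID occurs only once, so every counter is 1.

```r
# Toy categorical vector (hypothetical values, not the question's data)
basin <- c("North", "South", "North", "East", "South", "North")

# Occurrence counter within each group: 1st North, 1st South, 2nd North, ...
# (assumed to match what data.table::rowid(basin) returns)
occurrence <- ave(seq_along(basin), basin, FUN = seq_along)

# One integer code per distinct level (levels sorted alphabetically)
level_code <- as.integer(factor(basin))

print(data.frame(basin, occurrence, level_code))
```

If one code per category is what you are after, the level_code column is closer to that. As far as I know, the formula interface of rda() also accepts factors as constraints, so converting Basin and Sector with factor() may be enough without any integer coding.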

score 0 · Accepted Answer

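The documentation, ?rda, says that the left-hand side of the formula specifying your model needs to be a data matrix. You can't pass the name of a variable in the data object as the left-hand side (or at least, if that was ever intended, doing so exposes a bug in how we parse the formula, which then leads to further errors).

What you want is a data frame containing the variable ALL for the left-hand side of the formula.

This works: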

library('vegan')
df <- read.csv('~/Downloads/RDAmorph_Oct6.csv')

ALL <- df[, 'ALL', drop = FALSE]

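Note the drop = FALSE, which stops R from dropping the empty dimension (i.e. converting the single-column data frame to a vector).

Then your original call works: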

ord <- rda(ALL ~ Basin + Depth + Sector, data = df, na.action = 'na.exclude')
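The effect of drop = FALSE can be checked on a small example. This is a sketch with a made-up data frame called toy, not the question's data:

```r
# Toy data frame (hypothetical values)
toy <- data.frame(ALL = c(0.1, 0.9, 0.5), Depth = c(10, 20, 30))

# Default behaviour: a single column is simplified to a plain vector
as_vector <- toy[, "ALL"]

# With drop = FALSE the result stays a one-column data frame
as_frame <- toy[, "ALL", drop = FALSE]

print(is.data.frame(as_vector))  # vector, so no dim attribute
print(dim(as_vector))
print(is.data.frame(as_frame))   # still a data frame with 3 rows, 1 column
print(dim(as_frame))
```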

score 0 · Accepted Answer

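The problem is that rda expects a separate df for the first part of the formula (ALL in your code), and does not look it up in the one given in the data = argument.

As mentioned above, you can create a new df with the variables needed for the analysis, but here is a one-line solution that should also work: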

gen <- rda(RDAmorph_Oct6$ALL ~ Depth + Basin + Sector, data = RDAmorph_Oct6, na.action = na.exclude)
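The lookup behaviour can be illustrated in base R, without vegan, by evaluating the left-hand side of a formula by hand. This only mimics the eval() call shown in the question's error message and is not vegan's actual code. The toy data frame is made up.

```r
# Toy data frame (hypothetical values); ALL exists only as a column here
toy <- data.frame(ALL = c(0.1, 0.9, 0.5), Depth = c(10, 20, 30))

f   <- ALL ~ Depth
lhs <- f[[2]]  # the left-hand side of the formula, i.e. the symbol ALL

# Evaluated in the formula's environment, ALL is not found,
# because it lives inside toy and not in the workspace
before <- tryCatch(eval(lhs, environment(f)), error = function(e) e)
print(inherits(before, "error"))

# Once ALL exists as its own object, the same evaluation succeeds
ALL   <- toy[, "ALL", drop = FALSE]
after <- eval(lhs, environment(f))
print(after)
```

Writing RDAmorph_Oct6$ALL on the left-hand side works for the same reason: that expression can be evaluated from the workspace without going through data =.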
