r - How do I merge the response variable back into the data frame after removing it for standardization?

Question

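I have a data set with 61 columns (60 explanatory variables and 1 response variable).

All the explanatory variables are numeric, and the response is categorical (Default). Some of the explanatory variables have negative values (financial data), so it seems more sensible to standardize rather than normalize. However, when standardizing with the "apply" function, I have to remove the response variable first, so I do this: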

model <- read.table ......

modelwithnoresponse <- model 
modelwithnoresponse$Default <- NULL
means <- apply(modelwithnoresponse, 2, mean)
standarddeviations <- apply(modelwithnoresponse,2,sd)
modelSTAN <- scale(modelwithnoresponse,center=means,scale=standarddeviations)

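So far so good, the data is standardized. However, now I would like to add the response variable back to "modelSTAN". I have seen some posts on dplyr, merge functions and rbind, but I couldn't quite get them to work so that the response is simply added back as the last column of my "modelSTAN".

Does anyone have a good solution to this, or maybe another way to standardize without removing the response variable first?

I'm quite new to R, as I'm a finance student taking R as an elective.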

score 2 · Accepted Answer

If you want to add the column model$Default to the modelSTAN data frame, you can do it like this:

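# scale() returns a matrix, so convert it to a data frame first
modelSTAN <- as.data.frame(modelSTAN)
# assign the column directly
modelSTAN$Default <- model$Default
# or use cbind for columns (rbind is for rows)
modelSTAN <- cbind(modelSTAN, Default = model$Default)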

However, you don't need to remove it at all. Here is an alternative:

modelSTAN <- model 
## get index of response, here named Default
resp <- which(names(modelSTAN) == "Default")
## standardize all the non-response columns
means <- colMeans(modelSTAN[-resp])
sds <- apply(modelSTAN[-resp], 2, sd)
modelSTAN[-resp] <- scale(modelSTAN[-resp], center = means, scale = sds)
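Since the response is the only non-numeric column here, a variation is to pick the columns to standardize by type instead of by name. This is only a sketch on a small made-up data frame; `toy`, `x1`, `x2` and `num_cols` are illustrative names, not part of the question's data.

```r
# Toy data standing in for the real 61-column data set (values are made up)
toy <- data.frame(
  x1 = c(-2.5, 0.0, 1.5, 4.0),
  x2 = c(10, 20, 30, 60),
  Default = factor(c("no", "yes", "no", "yes"))
)

# Logical index marking the numeric (explanatory) columns
num_cols <- vapply(toy, is.numeric, logical(1))

toySTAN <- toy
# scale() centers by the column mean and divides by the column sd by default;
# as.data.frame() turns the resulting matrix back into ordinary columns
toySTAN[num_cols] <- as.data.frame(scale(toy[num_cols]))

# Sanity check: each standardized column should have mean ~0 and sd 1
round(colMeans(toySTAN[num_cols]), 10)
apply(toySTAN[num_cols], 2, sd)
```

The response column stays where it was, so nothing has to be merged back afterwards.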

If you are interested in dplyr:

library(dplyr)
modelSTAN <- model %>%
  # scale() returns a one-column matrix; as.numeric() keeps plain numeric columns
  mutate(across(-all_of("Default"), ~ as.numeric(scale(.x))))

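Note that in the dplyr version I didn't bother saving the original means and SDs, which you should still do if you want to back-transform later. By default, scale will use mean and sd.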
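To illustrate why saving the means and SDs matters, here is a rough sketch of a back-transformation on made-up data. The names `toy`, `toySTAN` and `toyBACK` are hypothetical, and `sweep()` is just one of several ways to undo the scaling.

```r
# Toy data (made-up values); Default is the response
toy <- data.frame(
  x1 = c(-2.5, 0.0, 1.5, 4.0),
  x2 = c(10, 20, 30, 60),
  Default = factor(c("no", "yes", "no", "yes"))
)
resp <- which(names(toy) == "Default")

# Save the parameters before scaling
means <- colMeans(toy[-resp])
sds <- apply(toy[-resp], 2, sd)

# Standardize the non-response columns with the saved parameters
toySTAN <- toy
toySTAN[-resp] <- as.data.frame(scale(toy[-resp], center = means, scale = sds))

# Back-transform: multiply each column by its sd, then add its mean
toyBACK <- toySTAN
toyBACK[-resp] <- as.data.frame(
  sweep(sweep(as.matrix(toySTAN[-resp]), 2, sds, `*`), 2, means, `+`)
)

# Compare the recovered values with the originals (up to floating-point error)
all.equal(toyBACK$x1, toy$x1)
all.equal(toyBACK$x2, toy$x2)
```

Without the saved `means` and `sds`, the original scale of the financial variables cannot be recovered from the standardized data alone.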
