r - R script: denormalising a data frame and replacing NA with the median

Question

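I am reading a set of subject response times from a CSV file into a data frame and need to denormalise it as follows:

Collapse all columns into two columns, then
replace NA and zero values with the median of the original response times.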

Actual input:

subject,1,2,3,4,5
Alpha,97,98,99,100,101
Beta,102,103,NA,104,0.00
Gamma,105,NA,NA,NA,NA

Expected output:

subject response
Alpha 97
Alpha 98
Alpha 99
Alpha 100
Alpha 101
Beta 102
Beta 103
Beta 101 # median
Beta 104
Beta 101 # median
Gamma 105
Gamma 101 # median
Gamma 101 # median
Gamma 101 # median
Gamma 101 # median

I have got part of the way there using:

input <- read.csv("rt.csv", header = TRUE, sep = ",")
names(input) <- tolower(names(input))

response <- input[setdiff(names(input), names(input[1]))]
cntCols  <- ncol(response)
y <- response[[1]]
for (i in 2:cntCols) {
    y = c(y, response[[i]])
}
extract <- as.data.frame(y)

wip <-
  data.frame(
    x = rep(c(levels(input[[1]]))),
    y = extract
  )

wip <- wip[order(wip[,1]),]

mdnInputY <- median(wip$y, na.rm = TRUE)
MedianReplace <- function(dfInput) {
  dfInput[is.na(dfInput)] <- mdnInputY
  dfInput[trimws(dfInput) == 0] <- mdnInputY
  return(dfInput)
}

output <- data.frame(apply(wip, 2, MedianReplace))

However, it falls short on one point:

It is not idiomatic (vectorised).

Any advice?

score 0 · Accepted Answer

Use aggregate from {stats}.

aggregate(x = input['V2'], by = input['V1'], FUN =  paste, collapse =', ')
aggregate(formula = V2 ~ V1, data = input, FUN =  paste, collapse =', ')
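For the reshaping and median replacement the question actually asks about, here is a minimal vectorised sketch in base R. It is not part of the accepted answer. The variable names (values, long, bad, med) are my own, and the sample data is inlined with read.csv(text = ...) instead of being read from rt.csv. It also assumes the median should ignore zeros as well as NAs, since that is what yields the 101 shown in the expected output; median(..., na.rm = TRUE) on its own would include the 0.00 and give 100.5.

```r
# Sample data from the question, inlined so the sketch is self-contained.
input <- read.csv(
  text = "subject,1,2,3,4,5
Alpha,97,98,99,100,101
Beta,102,103,NA,104,0.00
Gamma,105,NA,NA,NA,NA",
  header = TRUE, check.names = FALSE, stringsAsFactors = FALSE
)

# All response columns as one numeric matrix (one row per subject).
values <- as.matrix(input[-1])

# Transposing first makes as.vector() walk the data row by row,
# so each subject's responses stay together and in column order.
long <- data.frame(
  subject  = rep(input$subject, each = ncol(values)),
  response = as.vector(t(values))
)

# Assumption: zeros count as missing, both for replacement and for the median.
bad <- is.na(long$response) | long$response == 0
med <- median(long$response[!bad])
long$response[bad] <- med

print(long)
```

No loop or apply() is needed: the logical mask bad selects every NA and zero in one step, and the single assignment replaces them all.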
