r - Simple way to get averages based on names in a list

Question

Is there a simple way to average the items in a list based on their names? Example dataset:

sampleList <- list("a.1"=c(1,2,3,4,5), "b.1"=c(3,4,1,4,5), "a.2"=c(5,7,2,8,9), "b.2"=c(6,8,9,0,6))
sampleList
$a.1
[1] 1 2 3 4 5

$b.1
[1] 3 4 1 4 5

$a.2
[1] 5 7 2 8 9

$b.2
[1] 6 8 9 0 6

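What I am trying to do is take column averages across rows with similar but not identical names, producing a list that holds the column averages of the a's and of the b's. Currently I can do the following: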

y <- names(sampleList)
y <- gsub("\\.1", "", y)
y <- gsub("\\.2", "", y)
y <- sort(unique(y))
sampleList <- t(as.matrix(as.data.frame(sampleList)))
t <- list()
for (i in 1:length(y)){
   temp <- sampleList[grep(y[i], rownames(sampleList)),]
   t[[i]] <- apply(temp, 2, mean)
}

t
[[1]]
[1] 3.0 4.5 2.5 6.0 7.0

[[2]]
[1] 4.5 6.0 5.0 2.0 5.5

I have a large dataset with many sets of similar names. Is there an easier way to go about this?

EDIT: I have split the naming problem out into a separate question. It can be found here

score 6 · Accepted Answer

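Well, this is shorter. You didn't say exactly how big your actual data is, so I won't make any promises, but the performance of this shouldn't be terrible: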

dat <- do.call(rbind,sampleList)
grp <- substr(rownames(dat),1,1)

aggregate(dat,by = list(group = grp),FUN = mean)

(Edited to remove an unnecessary data frame conversion, which could have caused a significant performance hit.)
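
The question asked for a list, while aggregate returns a data frame with one row per group. A minimal sketch of converting that result back into a named list follows; the names res, means and resultList are illustrative and not part of the original answer.

```r
sampleList <- list("a.1" = c(1, 2, 3, 4, 5), "b.1" = c(3, 4, 1, 4, 5),
                   "a.2" = c(5, 7, 2, 8, 9), "b.2" = c(6, 8, 9, 0, 6))

# Same steps as in the answer: stack the vectors as rows, group by first letter
dat <- do.call(rbind, sampleList)
grp <- substr(rownames(dat), 1, 1)
res <- aggregate(dat, by = list(group = grp), FUN = mean)

# Drop the grouping column and turn each remaining row into one list element
means <- as.matrix(res[, -1])
resultList <- setNames(
  lapply(seq_len(nrow(means)), function(i) unname(means[i, ])),
  as.character(res$group)
)

resultList$a  # column means over a.1 and a.2
```

Grouping on the first character only works while every group prefix is a single letter; longer prefixes would need a different rule for deriving grp.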

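If your data is very large, or even just moderately large but with a fairly large number of groups so that each group contains only a few vectors, the standard advice is to look into data.table once you have rbind-ed the data into a matrix.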
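
As a rough illustration of that suggestion, the grouped mean could be written with data.table along the following lines. This is a sketch, not the answerer's code: it assumes the data.table package is installed, and the names dt and groupMeans are made up for the example.

```r
library(data.table)

sampleList <- list("a.1" = c(1, 2, 3, 4, 5), "b.1" = c(3, 4, 1, 4, 5),
                   "a.2" = c(5, 7, 2, 8, 9), "b.2" = c(6, 8, 9, 0, 6))

# Stack the vectors into a matrix first, as the answer recommends
dat <- do.call(rbind, sampleList)

# Convert to a data.table; unnamed matrix columns become V1..V5
dt <- as.data.table(dat)

# Assumption: the group is the first character of each row name
dt[, group := substr(rownames(dat), 1, 1)]

# Mean of every value column within each group
groupMeans <- dt[, lapply(.SD, mean), by = group]
groupMeans
```

Whether this beats aggregate on a given dataset is worth measuring rather than assuming.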

score 4 · Accepted Answer

I would probably do something like this:

# A *named* vector of patterns you want to group by
patterns <- c(start.a="^a",start.b="^b",start.c="^c")
# Find the locations of those patterns in your list
inds <- lapply(patterns, grep, x=names(sampleList))
# Calculate the mean of each list element that matches the pattern
out <- lapply(inds, function(i) 
  if(l <- length(i)) Reduce("+",sampleList[i])/l else NULL)
# Set the names of the output
names(out) <- names(patterns)
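
With many name sets, writing out every pattern by hand may get tedious. One possible variation, sketched here under the assumption that the group is everything before the first dot in each name, derives the groups from the names and reuses the same Reduce idea. The names groups and groupMeans are illustrative.

```r
sampleList <- list("a.1" = c(1, 2, 3, 4, 5), "b.1" = c(3, 4, 1, 4, 5),
                   "a.2" = c(5, 7, 2, 8, 9), "b.2" = c(6, 8, 9, 0, 6))

# Assumption: strip the first dot and everything after it to get the group
groups <- sub("\\..*$", "", names(sampleList))

# split() partitions the list by group; Reduce adds the vectors element-wise,
# and dividing by the number of vectors gives the element-wise mean
groupMeans <- lapply(split(sampleList, groups),
                     function(vs) Reduce(`+`, vs) / length(vs))

groupMeans$a
```

Unlike the pattern vector above, this yields no empty entries, because only groups that actually occur in the names are created.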
