This is probably fairly simple. I have a huge data frame that looks like this:

df1 <- structure(list(place = structure(c(1L, 5L, 1L, 4L), .Label = c("1","2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23","24", "25", "26"), class = "factor"), x = structure(list(c("A", "B", "C", "D", "E"), c("D", "E", "F","G", "H", "I"), c("D", "E", "F", "G", "H"), c("F", "H")), class = "AsIs")), .Names = c("place", "x"), row.names = c(1L, 2L, 3L, 4L), class = "data.frame")

> df1
  place            x
1     1 A, B, C,....
2     5 D, E, F,....
3     1 D, E, F,....
4     4         F, H

Another data frame, df2, holds the corresponding value for each list element in df1:

df2 <- structure(list(x = c('A','B','C','D','E','F','G','H','I','J','K','L','M'), value = c("5.2", "1.8", "2.7","3.8", "5.0","3.2", "4.5","2.4", "3.9", "1.2","2.3","4.3", "3.0")), .Names = c("x", "value"), row.names = c(1L,2L,3L,4L,5L,6L,7L,8L,9L,10L, 11L, 12L, 13L), class = "data.frame")

   x value
1  A   5.2
2  B   1.8
3  C   2.7
4  D   3.8
5  E   5.0
6  F   3.2
7  G   4.5
8  H   2.4
9  I   3.9
10 J   1.2
11 K   2.3
12 L   4.3
13 M   3.0

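I want to replace the elements in df1 with their corresponding values from df2 (so every A in df1 should become 5.2, and so on), and then perform operations on them, for example taking the mean of the x values for each place. Thanks!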


2 Answers


If the dataset is large, the environment-based lookup done by the lookup function in the qdap package may be useful:

library(qdap)
lapply(df1[, 2], lookup, df2)

Or, to get the means:

df2$value <- as.numeric(df2$value) #convert your df2 value column to numeric
sapply(df1[, 2], function(x) mean(lookup(x, df2)))
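
If you would rather not add a dependency, the general idea of an environment (hash table) lookup can be sketched in base R. This is only an illustration of the technique, not qdap's actual implementation; env_lookup and key_env are hypothetical names, and the data is a shortened copy of the question's with value already numeric.

```r
# Lookup table from the question, with value stored as numeric.
df2 <- data.frame(
  x = LETTERS[1:13],
  value = c(5.2, 1.8, 2.7, 3.8, 5.0, 3.2, 4.5, 2.4, 3.9, 1.2, 2.3, 4.3, 3.0),
  stringsAsFactors = FALSE
)

# Build a hashed environment holding one variable per key.
key_env <- list2env(setNames(as.list(df2$value), df2$x), hash = TRUE)

# Hypothetical helper: fetch the values for a character vector of keys.
# Keys that are not in the environment come back as NA.
env_lookup <- function(keys, env) {
  unlist(mget(keys, envir = env, ifnotfound = NA_real_), use.names = FALSE)
}

# Two of the list elements from df1$x, used here as sample input.
x_list <- list(c("A", "B", "C", "D", "E"), c("F", "H"))

values <- lapply(x_list, env_lookup, env = key_env)
means  <- sapply(values, mean)
```

For a table of 13 keys this is no faster than match; a hashed environment only starts to pay off when the key table is large.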
answered 2013-06-24T20:45:54.193

You can use match together with sapply:

df1$x <- sapply(df1$x, function(x) df2$value[match(x, df2$x)])

df1$x
# [[1]]
# [1] "5.2" "1.8" "2.7" "3.8" "5.0"
#
# [[2]]
# [1] "3.8" "5.0" "3.2" "4.5" "2.4" "3.9"
#
# [[3]]
# [1] "3.8" "5.0" "3.2" "4.5" "2.4"
#
# [[4]]
# [1] "3.2" "2.4"

Per the comments:

To average each row, you can use sapply again:

sapply(df1$x, function(v) mean(as.numeric(v)))  # the values are strings, so convert them first

Or in one step:

# run this on the original df1; as.numeric() is needed because df2$value is character
sapply(df1$x, function(x) mean(as.numeric(df2$value[match(x, df2$x)])))
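
The question also mentions a mean for each place, and place 1 appears in two rows. One possible reading, sketched below, pools every element that belongs to a place and averages the pooled values; if a mean of the per-row means is wanted instead, the numbers will differ. The names place, x and long are illustrative, and the data is copied from the question.

```r
# Data from the question, kept as plain vectors so the sketch is self-contained.
place <- c("1", "5", "1", "4")
x <- list(c("A", "B", "C", "D", "E"),
          c("D", "E", "F", "G", "H", "I"),
          c("D", "E", "F", "G", "H"),
          c("F", "H"))
df2 <- data.frame(
  x = LETTERS[1:13],
  value = c("5.2", "1.8", "2.7", "3.8", "5.0", "3.2", "4.5",
            "2.4", "3.9", "1.2", "2.3", "4.3", "3.0"),
  stringsAsFactors = FALSE
)

# Flatten to long form: one row per (place, key) pair.
long <- data.frame(place = rep(place, lengths(x)),
                   x = unlist(x),
                   stringsAsFactors = FALSE)

# Look up each key and convert the character values to numeric.
long$value <- as.numeric(df2$value[match(long$x, df2$x)])

# Mean of all pooled values per place.
place_means <- tapply(long$value, long$place, mean)
```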
answered 2013-06-24T20:05:03.363