r - Transforming a dataset (similarity ratings)

Question

I would like to transform the following data format (simplified representation):

  image1 image2 rating
1      1      2      6
2      1      3      5
3      1      4      7
4      2      3      3
5      2      4      5
6      3      4      1

Reproducible version:

structure(list(image1 = c(1, 1, 1, 2, 2, 3), image2 = c(2, 3, 
4, 3, 4, 4), rating = c(6, 5, 7, 3, 5, 1)), .Names = c("image1", 
"image2", "rating"), row.names = c(NA, -6L), class = "data.frame")

into a format that gives a kind of correlation matrix, where the first two columns serve as the indices and the rating is the value:

   1  2  3  4
1 NA  6  5  7
2  6 NA  3  5
3  5  3 NA  1
4  7  5  1 NA

Does anyone know of a function in R that can do this?

score 4 · Accepted Answer

I would rather use matrix indexing:

N <- max(dat[c("image1", "image2")])
out <- matrix(NA, N, N)
out[cbind(dat$image1, dat$image2)] <- dat$rating
out[cbind(dat$image2, dat$image1)] <- dat$rating

#      [,1] [,2] [,3] [,4]
# [1,]   NA    6    5    7
# [2,]    6   NA    3    5
# [3,]    5    3   NA    1
# [4,]    7    5    1   NA
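For a self-contained run, the same matrix-indexing idea can be sketched as below. The helper name `ratings_to_matrix` and the added dimnames are assumptions for illustration, not part of the answer; the data frame is the one from the question, assigned to `dat`.

```r
# Data frame from the question, assigned to `dat` as the answer assumes.
dat <- data.frame(
  image1 = c(1, 1, 1, 2, 2, 3),
  image2 = c(2, 3, 4, 3, 4, 4),
  rating = c(6, 5, 7, 3, 5, 1)
)

# Hypothetical helper (not from the answer): build a symmetric rating matrix.
ratings_to_matrix <- function(d) {
  n <- max(d$image1, d$image2)
  # Label rows and columns with the image numbers, as in the question's target.
  out <- matrix(NA_real_, n, n, dimnames = list(seq_len(n), seq_len(n)))
  # A two-column numeric matrix indexes (row, column) pairs by position.
  idx <- cbind(d$image1, d$image2)
  out[idx] <- d$rating
  # Swap the columns to fill the mirrored cells.
  out[idx[, 2:1, drop = FALSE]] <- d$rating
  out
}

m <- ratings_to_matrix(dat)
print(m)
```

Pairs that are absent from the data simply stay `NA`, since only the listed cells are assigned.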

score 3 · Accepted Answer

I'm not fond of the <<- operator, but it works for this (with your structure named s):

N <- max(s[,1:2])
m <- matrix(NA, nrow=N, ncol=N)
apply(s, 1, function(x) { m[x[1], x[2]] <<- m[x[2], x[1]] <<- x[3]})

 > m
     [,1] [,2] [,3] [,4]
[1,]   NA    6    5    7
[2,]    6   NA    3    5
[3,]    5    3   NA    1
[4,]    7    5    1   NA

Not as elegant as Karsten's solution, but it does not depend on the order of the rows, nor does it require all combinations to be present.
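As a rough illustration of the row-order and missing-pair claim, here is a sketch that uses a plain `for` loop instead of `apply` with `<<-`. The shuffled input with the pair (2, 4) dropped is a hypothetical example, not data from the question.

```r
# Rows shuffled and the pair (2, 4) dropped: hypothetical input for illustration.
s <- data.frame(
  image1 = c(3, 1, 2, 1, 1),
  image2 = c(4, 3, 3, 2, 4),
  rating = c(1, 5, 3, 6, 7)
)

N <- max(s[, 1:2])
m <- matrix(NA, nrow = N, ncol = N)

# A plain loop modifies `m` in the current scope, so `<<-` is not needed.
for (i in seq_len(nrow(s))) {
  a <- s$image1[i]
  b <- s$image2[i]
  m[a, b] <- s$rating[i]
  m[b, a] <- s$rating[i]
}
print(m)
```

The cells for the missing pair, `m[2, 4]` and `m[4, 2]`, remain `NA`, and the remaining values land in the right cells regardless of row order.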

score 1 · Accepted Answer

Here is one approach, where dat is the data frame defined in the question:

res <- matrix(0, nrow=4, ncol=4) # dim may need to be adjusted
ll <- lower.tri(res, diag=FALSE)
res[which(ll)] <- dat$rating
res <- res + t(res)
diag(res) <- NA

This only works if the rows are ordered as in the question.
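If the rows arrive in a different order, one possible workaround is to sort them first so that they match the column-major order of the lower triangle. This is a sketch under two assumptions that are not stated in the answer: every row has image1 < image2, and every pair appears exactly once. The shuffled data below is hypothetical.

```r
# The question's rows, deliberately shuffled (hypothetical order for illustration).
dat <- data.frame(
  image1 = c(2, 1, 3, 1, 2, 1),
  image2 = c(4, 3, 4, 2, 3, 4),
  rating = c(5, 5, 1, 6, 3, 7)
)

# Sort by image1, then image2, to match the column-major order of the lower triangle.
dat <- dat[order(dat$image1, dat$image2), ]

n <- max(dat$image1, dat$image2)
# This approach still needs every pair exactly once.
stopifnot(nrow(dat) == n * (n - 1) / 2)

res <- matrix(0, nrow = n, ncol = n)
res[lower.tri(res)] <- dat$rating
res <- res + t(res)
diag(res) <- NA
print(res)
```

When pairs can be missing, the matrix-indexing approach is the safer choice, because the lower-triangle fill relies on position alone.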
