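recommenderlab's dissimilarity() equals 1 - pmax(cor(), 0), where cor() is the base R function. It is also important to specify method explicitly so that both of them use the same one: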
library("recommenderlab")
data(MovieLense)

# Item-item Pearson dissimilarity from recommenderlab
cor_mat <- as(dissimilarity(MovieLense, method = "pearson", which = "items"), "matrix")

# Item-item Pearson correlation from base R, using pairwise-complete ratings
cor_mat_base <- suppressWarnings(
  cor(as(MovieLense, "matrix"), method = "pearson", use = "pairwise.complete.obs")
)

print(cor_mat[1:5, 1:5])
print(1 - cor_mat_base[1:5, 1:5])
> print( cor_mat[1:5, 1:5] )
                  Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995)         0.0000000        0.7782159         0.8242057         0.8968647      0.6135248
GoldenEye (1995)         0.7782159        0.0000000         0.7694644         0.7554443      0.7824406
Four Rooms (1995)        0.8242057        0.7694644         0.0000000         1.0000000      0.8153877
Get Shorty (1995)        0.8968647        0.7554443         1.0000000         0.0000000      1.0000000
Copycat (1995)           0.6135248        0.7824406         0.8153877         1.0000000      0.0000000
> print(1- cor_mat_base[1:5, 1:5] )
                  Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995)         0.0000000        0.7782159         0.8242057         0.8968647      0.6135248
GoldenEye (1995)         0.7782159        0.0000000         0.7694644         0.7554443      0.7824406
Four Rooms (1995)        0.8242057        0.7694644         0.0000000         1.2019687      0.8153877
Get Shorty (1995)        0.8968647        0.7554443         1.2019687         0.0000000      1.2373503
Copycat (1995)           0.6135248        0.7824406         0.8153877         1.2373503      0.0000000
To understand this properly, check the details of both packages :).
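The relationship quoted at the top of this answer can be illustrated on a tiny example that needs only base R. This is a minimal sketch: the matrix m and the names d_plain and d_floor are made up for illustration, and the code only mirrors the formula 1 - pmax(cor(), 0), it does not call recommenderlab itself.

```r
# Toy data (hypothetical): rows are users, columns are items
m <- cbind(a = c(1, 2, 3, 4),
           b = c(4, 3, 2, 1),
           c = c(1, 3, 2, 4))

# Pearson correlation between columns: a-b is -1, a-c is 0.8, b-c is -0.8
r <- cor(m, method = "pearson")

d_plain <- 1 - r            # exceeds 1 wherever the correlation is negative
d_floor <- 1 - pmax(r, 0)   # negative correlations are floored at 0, so the result is capped at 1

print(round(d_plain, 3))
print(round(d_floor, 3))
```

The two matrices agree wherever the correlation is non-negative and differ only in the entries with negative correlation, which is the same pattern as in the MovieLense output above.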
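Edit by OP:
It is important to point out that 1 - cor() and dissimilarity() still differ slightly in a few entries: those where 1 - cor() is greater than 1 (1.2019687 and 1.2373503 in the output above). This is because dissimilarity() floors the correlation at 0 (i.e. it does not return negative correlations), so the dissimilarity is capped at 1, whereas cor() can be negative. It also seems possible for cor() to return values greater than 1 in magnitude when use is not "all.obs". The documentation at https://www.rdocumentation.org/packages/stats/versions/3.6.0/topics/cor only states:
For r <- cor(*, use = "all.obs"), it is now guaranteed that all(abs(r) <= 1).
This should be checked.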
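One way to check this on a given correlation matrix is sketched below. The helper check_cor_bounds() and the ratings matrix are hypothetical names and data introduced only for this example; the tolerance of 1e-8 is an arbitrary assumption to absorb floating-point noise.

```r
# Hypothetical helper: counts entries whose magnitude exceeds 1 (beyond a small tolerance)
check_cor_bounds <- function(r, tol = 1e-8) {
  sum(abs(r) > 1 + tol, na.rm = TRUE)  # NA entries are ignored
}

# Toy ratings with missing values (hypothetical data)
ratings <- cbind(i1 = c(5, 3, NA, 1, 4),
                 i2 = c(4, NA, 2, 1, 5),
                 i3 = c(1, 5, 4, NA, 2))

r <- cor(ratings, method = "pearson", use = "pairwise.complete.obs")

print(range(r, na.rm = TRUE))  # smallest and largest correlation
print(check_cor_bounds(r))     # number of entries outside [-1, 1]
```

The same helper could be applied to cor_mat_base from the code above to see whether any entry actually falls outside [-1, 1] for the MovieLense data.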