I have a data set that looks like this:

> data
             a       b        c         d
[1,] 0.5943590 2.195610 0.5332164 1.3004142
[2,] 0.7635876 1.917823 0.9714945 1.3251010
[3,] 0.9942722 2.350122 1.2048159 1.1675700
[4,] 0.3736785 1.876318 0.9109197 0.8520509

I then want to apply a function to every pair of columns, for example:

F2 <- function(x, y) sum((x - y)^2)  # define the function
# data prints as a matrix, so columns are selected with [, "name"] rather than $
F2(data[, "a"], data[, "b"])  # first and second columns
F2(data[, "a"], data[, "c"])  # first and third columns
F2(data[, "b"], data[, "c"])  # second and third columns
..................

How can I do this with the apply family? Any help is greatly appreciated.

1 Answer

This is a job for combn:

#some data
set.seed(42)
m <- matrix(rnorm(16),4)

F2<- function(x,y) (sum((x - y) ^ 2))

res <- matrix(NA, ncol(m), ncol(m))

res[lower.tri(res)] <- combn(ncol(m), 2, 
                             FUN=function(ind) F2(m[,ind[1]], m[,ind[2]]))

print(res)

#          [,1]     [,2]     [,3] [,4]
# [1,]       NA       NA       NA   NA
# [2,] 2.992875       NA       NA   NA
# [3,] 4.293073 8.320698       NA   NA
# [4,] 7.944818 6.484424 16.44946   NA

#for nicer printing
as.dist(res)

#           1         2         3
# 2  2.992875                    
# 3  4.293073  8.320698          
# 4  7.944818  6.484424 16.449463
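
Since the question asked about the apply family and its data has named columns, here is a minimal sketch of the same idea that labels each result by its column pair. The column names a to d are copied from the question, the random values are placeholders for illustration, and the variable names pairs and res_named are my own:

```r
# Placeholder data with named columns, mirroring the layout in the question
set.seed(42)
m <- matrix(rnorm(16), 4, dimnames = list(NULL, c("a", "b", "c", "d")))

F2 <- function(x, y) sum((x - y)^2)

# All unordered pairs of column names; each column of `pairs` is one pair
pairs <- combn(colnames(m), 2)

# Apply F2 to every pair; vapply checks that each call returns one number
res_named <- vapply(seq_len(ncol(pairs)),
                    function(k) F2(m[, pairs[1, k]], m[, pairs[2, k]]),
                    numeric(1))
names(res_named) <- paste(pairs[1, ], pairs[2, ], sep = "-")

res_named
```

The result is a named numeric vector with one entry per pair (a-b, a-c, and so on), which may be easier to read than a lower-triangular matrix when there are only a few columns.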

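Of course, for this particular function you are better off using dist, which is optimized for this kind of distance calculation. Because dist computes Euclidean distances between rows, transpose the matrix first and square the result: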

dist(t(m))^2

#           1         2         3
# 2  2.992875                    
# 3  4.293073  8.320698          
# 4  7.944818  6.484424 16.449463
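
As a sanity check, the two approaches can be compared directly. This is only a sketch: it relies on combn enumerating pairs in the same order in which a dist object stores its lower triangle (column by column), and the names via_combn and via_dist are my own:

```r
set.seed(42)
m <- matrix(rnorm(16), 4)

F2 <- function(x, y) sum((x - y)^2)

# Pairwise results from combn, in the order (1,2), (1,3), (1,4), (2,3), ...
via_combn <- combn(ncol(m), 2,
                   FUN = function(ind) F2(m[, ind[1]], m[, ind[2]]))

# Squared Euclidean distances between columns; as.vector drops the dist attributes
via_dist <- as.vector(dist(t(m))^2)

# Compare up to floating-point tolerance
all.equal(via_combn, via_dist)
```

If the comparison holds, dist can stand in for the combn version whenever the function is a squared Euclidean distance; for an arbitrary F2, the combn approach remains the general one.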
answered 2013-09-01T09:05:16.110