I just want to know whether there is a way to perform clustering with the Mahalanobis distance using the cmeans function [in package e1071]?

Many thanks.

1 Answer

The e1071 package does not have a Mahalanobis option. However, you can look at the fanny function in the cluster package. As its help page says, it also computes a fuzzy clustering of the data into k clusters, and it lets you supply your own distance matrix.

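So for Mahalanobis distance, you can calculate the distance matrix yourself and then run the clustering on it. Base R's dist has no "mahalanobis" method, so one way is to whiten the data with the Cholesky factor of the covariance matrix and take ordinary Euclidean distances:

library(cluster)
set.seed(123)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
# Euclidean distances on the whitened data z equal Mahalanobis
# distances on x (using the pooled sample covariance cov(x))
z <- x %*% solve(chol(cov(x)))
y <- dist(z)
fanny(y, k = 2)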
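
If you want to double-check a Mahalanobis distance matrix before clustering on it, base R's mahalanobis() returns squared distances from a given center, so looping over the rows gives an independent reference. This is only a minimal sketch: it assumes the pooled sample covariance cov(x) is the covariance you want, and maha_dist is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: pairwise Mahalanobis distances as a "dist" object.
# Assumes the pooled sample covariance cov(x) is the intended covariance.
maha_dist <- function(x, S = cov(x)) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # mahalanobis() returns *squared* distances from row i to every row
    d[i, ] <- sqrt(mahalanobis(x, center = x[i, ], cov = S))
  }
  as.dist(d)
}

set.seed(123)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
y_ref <- maha_dist(x)

# Independent cross-check: Euclidean distances after whitening the data
# with the Cholesky factor of the same covariance matrix
y_white <- dist(x %*% solve(chol(cov(x))))
all.equal(as.vector(y_ref), as.vector(y_white))
```

The resulting object is an ordinary dissimilarity, so it can be handed to fanny in the usual way, e.g. fanny(y_ref, k = 2).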

Given your understandable concerns about equivalence between the two functions, here is an example comparing them:

require(e1071)
cl <- cmeans(x, centers = 2, iter.max = 20, dist = "euclidean", method = "cmeans", m = 2)
fl <- fanny(x, k = 2, maxit = 20, metric = "SqEuclidean", memb.exp = 2)

> head(cl$membership)
             1           2
[1,] 0.9948729 0.005127121
[2,] 0.3647778 0.635222221
[3,] 0.9290126 0.070987385
[4,] 0.7588260 0.241174043
[5,] 0.9282550 0.071745007
[6,] 0.9599231 0.040076886
> head(fl$membership)
          [,1]        [,2]
[1,] 0.9948722 0.005127775
[2,] 0.3647890 0.635211040
[3,] 0.9290171 0.070982905
[4,] 0.7588304 0.241169649
[5,] 0.9282575 0.071742489
[6,] 0.9599221 0.040077878

Although the results are not absolutely identical, you can see they are very close. Note that fanny is given the squared Euclidean distance here, which is what cmeans uses. This equivalence is noted on the fanny help page (?fanny) under metric.
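
To see why the squared Euclidean metric is the one that lines the two up, it can help to compare the objectives numerically. The sketch below uses base R only and rests on an assumption about the two formulations: that fanny minimizes a pairwise-dissimilarity objective (as described in its documentation) and that fuzzy c-means minimizes weighted squared distances to cluster centers. The memberships u are arbitrary random values, not fitted ones, and all variable names are illustrative.

```r
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)           # 20 points in 2-D
k <- 2                                     # number of clusters
r <- 2                                     # membership exponent (memb.exp / m)
u <- matrix(runif(nrow(x) * k), ncol = k)
u <- u / rowSums(u)                        # arbitrary memberships, rows sum to 1

d2 <- as.matrix(dist(x))^2                 # squared Euclidean dissimilarities
w  <- u^r

# Pairwise form (assumed fanny-style): uses dissimilarities only, no centers
obj_pairwise <- sum(sapply(seq_len(k), function(v)
  sum(outer(w[, v], w[, v]) * d2) / (2 * sum(w[, v]))))

# Center form (assumed c-means-style): squared distances to weighted means
obj_centers <- sum(sapply(seq_len(k), function(v) {
  ctr <- colSums(w[, v] * x) / sum(w[, v])
  sum(w[, v] * rowSums(sweep(x, 2, ctr)^2))
}))

all.equal(obj_pairwise, obj_centers)
```

The two values agree only because d2 holds squared distances; with plain Euclidean distances in the pairwise form the identity no longer holds, which is consistent with the metric choice in the comparison above.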

Answered 2014-09-29T15:13:06.567