r - Clustering a large data matrix using R

Question

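I have a large data matrix (33183x1681); each row corresponds to an observation and each column to a variable.

I applied K-medoids clustering using the pam() function in R and tried to visualize the clustering results with the built-in plots that pam() provides. I got this error: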

Error in princomp.default(x, scores = TRUE, cor = ncol(x) != 2) :
cannot use cor=TRUE with a constant variable

I think this problem is caused by the high dimensionality of the data matrix I am trying to cluster.

Any thoughts/ideas on how to solve this?

score 6 · Accepted Answer

Take a look at the clara() function in the cluster package, which ships with all versions of R.

library("cluster")
## generate 500 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(200,0,8), rnorm(200,0,8)),
           cbind(rnorm(300,50,8), rnorm(300,50,8)))
clarax <- clara(x, 2, samples=50)
clarax

> clarax
Call:    clara(x = x, k = 2, samples = 50) 
Medoids:
         [,1]       [,2]
[1,] -1.15913  0.5760027
[2,] 50.11584 50.3360426
Objective function:  10.23341
Clustering vector:   int [1:500] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
Cluster sizes:           200 300 
Best sample:
 [1]  10  17  45  46  68  90  99 150 151 160 184 192 232 238 243 250 266 275 277
[20] 298 303 304 313 316 327 333 339 353 358 398 405 410 411 421 426 429 444 447
[39] 456 477 481 494 499 500

Available components:
 [1] "sample"     "medoids"    "i.med"      "clustering" "objective" 
 [6] "clusinfo"   "diss"       "call"       "silinfo"    "data"

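Note that you should study the help for clara() (?clara) in some detail, along with the references cited there, in order to make the clustering performed by clara() close or identical to that of pam().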
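
As a rough illustration of both points, here is a minimal sketch. It assumes the error in the question comes from one or more constant (zero-variance) columns, which is what the princomp() message states, rather than from the dimensionality as such. The simulated matrix, the column name const, and the sampsize value are made up for the example; pamLike = TRUE is the clara() argument that ?clara describes as making the swap phase use the same algorithm as pam().

```r
library("cluster")

set.seed(1)  # make the simulated data reproducible

## Hypothetical stand-in for the real matrix: 1000 rows in two
## well-separated groups, 5 informative columns plus one constant column.
x <- rbind(matrix(rnorm(400 * 5, mean = 0,  sd = 8), ncol = 5),
           matrix(rnorm(600 * 5, mean = 50, sd = 8), ncol = 5))
x <- cbind(x, const = 1)

## Flag columns whose values never change; princomp(cor = TRUE),
## which the cluster plot calls internally, cannot scale these.
is_const <- apply(x, 2, function(col) diff(range(col)) == 0)
x_var <- x[, !is_const, drop = FALSE]

## clara() with a larger sub-sample than the default and the
## pam()-style swap phase (sampsize = 200 is an arbitrary choice).
fit <- clara(x_var, k = 2, samples = 50, sampsize = 200, pamLike = TRUE)

## Cluster sizes
table(fit$clustering)

## The plot no longer has a constant column to trip over.
clusplot(fit)
```

On the real 33183x1681 matrix the same filter would be applied to the full data before calling clara(); whether dropping the constant columns is appropriate depends on what those variables represent.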
