data-mining - How to optimize K in the K-means algorithm

Question

Possible duplicate:
How do I determine k when using k-means clustering?

If I know nothing about the data, how do I choose an initial K?

Can someone help me choose K?

Thanks, Naveen

score 0 · Accepted Answer

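The basic idea is to evaluate a clustering score on sample data, usually one based on within-cluster distances and between-cluster distances. The better the score (small distances within clusters, large distances between them), the better the clustering, so you can run the clustering for several parameter values and pick the one that scores best. One such metric can be found here: http://alias-i.com/lingpipe/docs/api/com/aliasi/cluster/ClusterScore.html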
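As a rough illustration of that idea, here is a minimal pure-Python sketch that runs k-means for several values of k and keeps the k with the best score. It is not the LingPipe `ClusterScore` API. The score used here is the mean silhouette, which is one common way to combine within-cluster and between-cluster distances. The toy data, the function names (`farthest_point_init`, `kmeans`, `silhouette`) and the deterministic farthest-point seeding are all assumptions made for this example; real code would more likely use random or k-means++ seeding with several restarts.

```python
import math
import random


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then keep adding the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette in [-1, 1]; higher means tighter, better separated clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (hypothetical): three tight, well separated 2-D blobs.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in blob_centers for _ in range(20)]

# Score every candidate k and keep the best one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On real data the scores are rarely this clear-cut, so it usually helps to look at the whole `scores` table rather than only the maximum.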

score -8 · Accepted Answer

Seriously, what do you want to know? Do you want us to tell you some number? Or a strategy for finding the optimal k? You have to read a book or other resources about k-means; I'm pretty sure it is covered there.

There is something on Wikipedia about it:

http://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set

Before you use an algorithm, read about it.
