algorithm - Different results for the K-means algorithm in Weka

Question

If I use any algorithm in Weka, I get results in the following format:

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         302               63.3124 %
Incorrectly Classified Instances       175               36.6876 %
Kappa statistic                          0.3536
Mean absolute error                      0.3464
Root mean squared error                  0.4176
Relative absolute error                 85.5832 %
Root relative squared error             92.8684 %
Total Number of Instances              477     

=== Detailed Accuracy By Class ===

           TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
             0.801     0.407      0.686     0.801     0.739      0.659    1
             0.748     0.243      0.549     0.748     0.633      0.718    2
             0         0          0         0         0          0.478    3
Weighted Avg.    0.633     0.283      0.516     0.633     0.568      0.641

=== Confusion Matrix ===

     a   b   c   <-- classified as
   201  50   0 |   a = 1
    34 101   0 |   b = 2
    58  33   0 |   c = 3
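
For reference, the summary and per-class figures above can be recomputed from the confusion matrix alone. The following is a minimal sketch in plain Java, not Weka code: the class name `ConfusionMatrixMetrics` and its methods are hypothetical, and returning 0 when a class is never predicted is an assumption chosen to match the output shown for class 3.

```java
// Hypothetical helper, not part of Weka: derives accuracy, precision and
// recall from a confusion matrix laid out like Weka's
// (rows = actual class, columns = predicted class).
class ConfusionMatrixMetrics {

    // Fraction of instances on the diagonal (correctly classified).
    static double accuracy(int[][] m) {
        int correct = 0;
        int total = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                total += m[i][j];
                if (i == j) {
                    correct += m[i][j];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // Recall (TP rate) for class k: diagonal cell divided by its row sum.
    static double recall(int[][] m, int k) {
        int rowSum = 0;
        for (int j = 0; j < m[k].length; j++) {
            rowSum += m[k][j];
        }
        return rowSum == 0 ? 0.0 : (double) m[k][k] / rowSum;
    }

    // Precision for class k: diagonal cell divided by its column sum.
    // Assumption: report 0 when the class is never predicted.
    static double precision(int[][] m, int k) {
        int colSum = 0;
        for (int i = 0; i < m.length; i++) {
            colSum += m[i][k];
        }
        return colSum == 0 ? 0.0 : (double) m[k][k] / colSum;
    }

    public static void main(String[] args) {
        // The confusion matrix from the output above.
        int[][] m = {
            {201,  50, 0},
            { 34, 101, 0},
            { 58,  33, 0}
        };
        System.out.printf("accuracy = %.4f%n", accuracy(m));
        for (int k = 0; k < m.length; k++) {
            System.out.printf("class %d: precision = %.3f, recall = %.3f%n",
                    k + 1, precision(m, k), recall(m, k));
        }
    }
}
```

These numbers only exist because every instance has both an actual and a predicted class, which is the part a plain clusterer does not provide.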

However, if I use k-means, my results come in the following format:

=== Model and evaluation on training set ===


kMeans
======

Number of iterations: 9
Within cluster sum of squared errors: 297.46622082142716
Missing values globally replaced with mean/mode

Cluster centroids:
                            Cluster#
Attribute        Full Data         0         1         2
                     (477)     (136)     (172)     (169)
========================================================
Religion            8.6939    7.6691    8.9709    9.2367
Vote_Criterion      2.7736    2.8971    2.4942    2.9586
Sex                 1.4906    1.4559         2         1
DateBirth        1930.7652 1937.5147 1920.2965 1935.9882
Educ                3.2201    3.2721    3.2209    3.1775
Immigrant           1.6415    1.6838    1.5872    1.6627
Income              2.4675       2.5    2.5523     2.355
Occupation          3.6184    3.8162    3.2907    3.7929
Vote2013                 1         2         1         1




Time taken to build model (full training data) : 0.06 seconds

=== Model and evaluation on training set ===

    Clustered Instances

    0       136 ( 29%)
    1      172 ( 36%)
    2      169 ( 35%)

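But I want to see the correctly classified instances, precision, recall and so on, the way the other algorithms show them. Why does this happen, and how can I get Weka to show the k-means results in the first format?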

score 1 · Accepted Answer

K-Means is, by its nature, a clustering algorithm:

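Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).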

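So it has no notion of a "class" and is therefore not meant for classification (you can of course use it that way, but the performance will probably not be very good). Are you sure you are using it correctly here?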

Also, see here (emphasis mine):

You can use the meta-classifier ClassificationViaClustering in order to use a clusterer in a supervised setting.
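
The general idea of using a clusterer in a supervised setting can be illustrated without Weka. One simple approach, shown here as an assumption and not as a description of what ClassificationViaClustering does internally, is to label each cluster with the class that is most common among its members and then treat the cluster assignment as a prediction. The class `ClusterMajorityVote`, its methods and the toy data are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Weka code: turns cluster assignments into class
// predictions by giving each cluster the majority class of its members.
class ClusterMajorityVote {

    // clusterIds[i] and classLabels[i] describe instance i.
    // Returns a map from cluster id to that cluster's most frequent class.
    // Ties are broken arbitrarily (by map iteration order).
    static Map<Integer, Integer> majorityClassPerCluster(int[] clusterIds, int[] classLabels) {
        // cluster id -> (class label -> number of members with that label)
        Map<Integer, Map<Integer, Integer>> counts = new HashMap<>();
        for (int i = 0; i < clusterIds.length; i++) {
            counts.computeIfAbsent(clusterIds[i], k -> new HashMap<>())
                  .merge(classLabels[i], 1, Integer::sum);
        }

        Map<Integer, Integer> mapping = new HashMap<>();
        for (Map.Entry<Integer, Map<Integer, Integer>> cluster : counts.entrySet()) {
            int bestLabel = -1;
            int bestCount = -1;
            for (Map.Entry<Integer, Integer> c : cluster.getValue().entrySet()) {
                if (c.getValue() > bestCount) {
                    bestCount = c.getValue();
                    bestLabel = c.getKey();
                }
            }
            mapping.put(cluster.getKey(), bestLabel);
        }
        return mapping;
    }

    // Fraction of instances whose cluster's majority class equals their own class.
    static double accuracy(int[] clusterIds, int[] classLabels, Map<Integer, Integer> mapping) {
        int correct = 0;
        for (int i = 0; i < clusterIds.length; i++) {
            if (mapping.get(clusterIds[i]) == classLabels[i]) {
                correct++;
            }
        }
        return (double) correct / clusterIds.length;
    }

    public static void main(String[] args) {
        // Toy data (made up for illustration): 9 instances, 3 clusters, classes 1..3.
        int[] clusterIds  = {0, 0, 0, 1, 1, 1, 2, 2, 2};
        int[] classLabels = {1, 1, 2, 2, 2, 2, 1, 3, 3};

        Map<Integer, Integer> mapping = majorityClassPerCluster(clusterIds, classLabels);
        System.out.println("cluster -> class: " + mapping);
        System.out.println("accuracy: " + accuracy(clusterIds, classLabels, mapping));
    }
}
```

Once each instance has a predicted class in this way, a confusion matrix and the usual precision and recall figures can be computed just as for any classifier. Because the clusters were formed without looking at the class attribute, there is no guarantee that they line up with the classes, which is why the accuracy may be poor.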

score 0 · Accepted Answer

In this case, the ClassificationViaClustering meta-classifier can be used. In WEKA 3.8 it has to be downloaded separately through the package manager. Hope this helps.
