python - Plotting/changing the marker shape of KMeans clusters based on a categorical column (python/matplotlib)

Question

I have a data frame DF that looks like this:

    col1 col2 col3         test_results
    (Some discrete values) Ok
                           Ok
                           NOK
                           Finished
     .....                 NOK

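I have applied PCA to reduce the dimensionality, which gives a numpy array named reduced_cr. I then applied KMeans, found that a 3-cluster solution works best, and plotted the clusters as follows: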

  plt.scatter(reduced_cr[:,0],reduced_cr[:,1],c=km_3_new.labels_,cmap='Spectral',alpha=0.5)
  plt.show()

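Now I want to change the shape of these scatter markers according to the categorical column test_results in my DF. The problem is that reduced_cr is a numpy array, so I don't see how to use it to change the marker shape. Or is that possible?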

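The other approach I thought of is to use the cluster labels somehow, so I added a cluster label column named "cluster_3" to my DF. Is it possible to plot the clusters and also change the marker shape according to the categorical column test_results?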

score 0 · Accepted Answer

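Does this plotting routine, in which each group is plotted separately with its own parameters, answer your question? I don't have your data, so I just invented the data and the group indices (km_3_new.labels_ in your question). You can use the labels as indices, as shown below.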

import matplotlib.pyplot as plt
import numpy as np

# generating some data:
reduced_cr = np.random.rand(5,2)

# generating indices (you'll use K-means groups):
inds1 = [0,2]
inds2 = [1,3,4]

# each group gets its own marker; cmap is dropped because matplotlib ignores it without a numeric c
plt.scatter(reduced_cr[inds1,0], reduced_cr[inds1,1], alpha=0.5, marker='o')

plt.scatter(reduced_cr[inds2,0], reduced_cr[inds2,1], alpha=0.7, marker='+', c='r')
plt.show()
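
The hand-written index lists above could instead be derived from the cluster labels with boolean masks. The following is only a sketch under assumptions: `cluster_labels` is a random stand-in for `km_3_new.labels_`, the data are random, and the `markers` mapping is a hypothetical choice, none of which come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
reduced_cr = rng.random((12, 2))         # stand-in for the PCA output
cluster_labels = rng.integers(0, 3, 12)  # stand-in for km_3_new.labels_

# Hypothetical marker choice, one per cluster id.
markers = {0: "o", 1: "+", 2: "s"}

fig, ax = plt.subplots()
for cluster_id, marker in markers.items():
    # Boolean mask selecting the rows that belong to this cluster.
    mask = cluster_labels == cluster_id
    ax.scatter(reduced_cr[mask, 0], reduced_cr[mask, 1],
               marker=marker, alpha=0.7, label=f"cluster {cluster_id}")
ax.legend()
```

Because a single `scatter` call accepts only one marker, one call per group is needed; call `plt.show()` or `fig.savefig(...)` afterwards to view the result.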

Edit: to get inds from a list of labels, you can do the following:

label = ['OK','OK','Finished','NOk','OK'] # inventing labels (one per row of reduced_cr), I don't have your data
colordict = {'OK': 'red', 'NOk': 'blue', 'Finished': 'green'}

fig, ax = plt.subplots(1)
for name, color in colordict.items():  # one scatter call per distinct label
    inds = [j for j, x in enumerate(label) if x == name]
    ax.scatter(reduced_cr[inds,0], reduced_cr[inds,1], c=color)
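
To get what the question asks for (colour by cluster, marker shape by test_results), the same per-group loop could be combined with the original `c=`/`cmap=` colouring. This is a rough sketch, not a tested solution for the real data: all three arrays are random stand-ins (in the question they would be `reduced_cr`, `km_3_new.labels_` or `DF['cluster_3']`, and `DF['test_results'].to_numpy()`), and the `marker_for` mapping is a hypothetical choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a plot window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
reduced_cr = rng.random((30, 2))                           # stand-in for the PCA output
cluster_labels = rng.integers(0, 3, 30)                    # stand-in for km_3_new.labels_
test_results = rng.choice(["Ok", "NOK", "Finished"], 30)   # stand-in for DF['test_results']

# Hypothetical mapping from category to marker shape.
marker_for = {"Ok": "o", "NOK": "s", "Finished": "^"}

fig, ax = plt.subplots()
for result, marker in marker_for.items():
    mask = test_results == result  # rows with this test result
    ax.scatter(reduced_cr[mask, 0], reduced_cr[mask, 1],
               c=cluster_labels[mask], cmap="Spectral",
               vmin=0, vmax=2,     # fixed range so each cluster keeps one colour across calls
               marker=marker, alpha=0.5, label=result)
ax.legend(title="test_results")
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
```

Fixing `vmin` and `vmax` matters here: without them each `scatter` call would rescale the colormap to the cluster ids present in that subset, so the same cluster could appear in different colours for different markers.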
