
I am clustering with Python's sklearn.cluster. I have 61 samples, each with 26 dimensions. The raw data:

UserID  Communication_dur   Lifestyle_dur   Music & Audio_dur   Others_dur  Personnalisation_dur    Phone_and_SMS_dur   Photography_dur Productivity_dur    Social_Media_dur    System_tools_dur    ... Music & Audio_Freq  Others_Freq Personnalisation_Freq   Phone_and_SMS_Freq  Photography_Freq    Productivity_Freq   Social_Media_Freq   System_tools_Freq   Video players & Editors_Freq    Weather_Freq
1   63  219 9   10  99  42  36  30  76  20  ... 2   1   11  5   3   3   9   1   4   8
2   9   0   0   6   78  0   32  4   15  3   ... 0   2   4   0   2   1   2   1   0   0


import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Standardize every column to zero mean and unit variance before PCA
Sc = StandardScaler()
X = Sc.fit_transform(df)

I applied PCA to the DataFrame so that I can plot the clusters produced by K-means.

pca = PCA(3) 
pca.fit(X) 
pca_data = pd.DataFrame(pca.transform(X)) 
print(pca_data.head())

Data:

    0  1  2
 0  8 -4  5
 1 -2 -2  1
 2  1  1 -0
 3  2 -1  1
 4  3 -1 -3
from sklearn.cluster import KMeans

kmeans_pca = KMeans(n_clusters=10, init="k-means++", random_state=42)
kmeans_pca.fit(pca_data)

Now I want to plot the resulting clusters. How can I do that?

1 Answer

I haven't tested this, but you can visualize the clusters with code like the following:

import matplotlib.pyplot as plt
import seaborn as sns

def show_clusters(data, labels):
    # One distinct color per cluster label
    palette = sns.color_palette('hls', n_colors=len(set(labels)))
    # Only the first two principal components are plotted
    sns.scatterplot(x=data.iloc[:, 0], y=data.iloc[:, 1], hue=labels, palette=palette)
    plt.axis('off')
    plt.show()

Then call the function, passing the PCA data and the K-means cluster labels:

show_clusters(pca_data, kmeans_pca.labels_)

Output: (image: cluster visualization)
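
For anyone who wants to try this end to end without the original DataFrame, here is a minimal sketch using plain matplotlib. It is not the asker's data: the 61 x 26 table is replaced by random Poisson counts, and the names `df_demo`, `pcs`, `kmeans_demo`, the choice of 4 clusters, and the output file `clusters.png` are all assumptions made for illustration. It also marks the centroids, which the function above does not do.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the question's data: 61 rows, 26 features (assumption)
rng = np.random.default_rng(42)
df_demo = pd.DataFrame(rng.poisson(lam=20, size=(61, 26)))

# Same pipeline as in the question: standardize, then project onto 3 components
X_demo = StandardScaler().fit_transform(df_demo)
pcs = PCA(n_components=3).fit_transform(X_demo)

# Cluster in PCA space; 4 clusters is an arbitrary choice for this demo
kmeans_demo = KMeans(n_clusters=4, n_init=10, random_state=42).fit(pcs)
labels = kmeans_demo.labels_
centers = kmeans_demo.cluster_centers_

# Scatter the first two components, colored by cluster, with centroids marked
fig, ax = plt.subplots()
ax.scatter(pcs[:, 0], pcs[:, 1], c=labels, cmap="tab10")
ax.scatter(centers[:, 0], centers[:, 1], c="black", marker="x", s=80, label="centroids")
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.legend()
fig.savefig("clusters.png")
```

Because the clustering was fitted on three components, the 2D plot is only a projection; clusters that look overlapping here may be separated along the third component.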

Answered 2021-02-12T11:48:41.877