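I applied K-Means clustering to my data and then used t-SNE to plot it. I have 4 dimensions and 4 clusters. The K-Means result looks correct, so why do points from the same group not end up together in the t-SNE plot?
The code: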
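import pandas as pd
import matplotlib.pyplot as plt
# df is the DataFrame loaded earlier (not shown here)
XX = df[["agent_os_new", "agent_category_new", "referer_new", "agent_name_new"]]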
y = df['referer_new']
y
cols = XX.columns
from sklearn.preprocessing import MinMaxScaler
ms = MinMaxScaler()
X = ms.fit_transform(XX)
# pass cols directly; wrapping it in a list creates MultiIndex columns
X = pd.DataFrame(X, columns=cols)
X[:4]
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=4, random_state=0)
ymeans = kmeans.fit(X)
ymeans
labels = kmeans.labels_
df_new = XX.assign(Cluster =labels)
df_new
from sklearn.manifold import TSNE
import seaborn as sns
X_embedded = TSNE(n_components=2).fit_transform(df_new)
df_subset = pd.DataFrame()
df_subset['tsne1'] = X_embedded[:,0]
df_subset['tsne2'] = X_embedded[:,1]
plt.figure(figsize=(16,10))
sns.scatterplot(
    x="tsne1", y="tsne2",
    hue=df.label,
    palette="Set1",
    data=df_subset,
    style=df_new["Cluster"],
    legend="full",
    s=120
)
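
One thing worth checking in the code above: K-Means is fitted on the scaled X, but TSNE is fitted on df_new, which holds the unscaled columns plus the Cluster column, and the plot is colored by df.label rather than by the K-Means labels. The following is only a minimal sketch under stated assumptions, not a confirmed fix: it uses synthetic data from make_blobs as a stand-in for the real columns (the names f1..f4, X_scaled, cluster_labels and plot_df are hypothetical), fits both K-Means and t-SNE on the same scaled matrix, and keeps the cluster labels for coloring.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the 4 feature columns (assumption: not the real data)
raw, _ = make_blobs(n_samples=200, n_features=4, centers=4, random_state=0)
features = pd.DataFrame(raw, columns=["f1", "f2", "f3", "f4"])

# Scale once and reuse the same matrix for clustering and for the embedding
X_scaled = MinMaxScaler().fit_transform(features)

# Cluster on the scaled features
kmeans = KMeans(n_clusters=4, random_state=0, n_init=10)
cluster_labels = kmeans.fit_predict(X_scaled)

# Embed the scaled features only; the cluster column is not fed to t-SNE
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_scaled)

# Collect the 2-D coordinates and the K-Means labels in one frame for plotting
plot_df = pd.DataFrame(embedding, columns=["tsne1", "tsne2"])
plot_df["Cluster"] = cluster_labels
```

plot_df can then be passed to sns.scatterplot with x="tsne1", y="tsne2" and hue="Cluster", so that color reflects the K-Means assignment. Even then, t-SNE mainly preserves local neighborhoods and depends on perplexity and the random seed, so the layout may still not match the K-Means partition exactly.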
What I want is: