I am working on a clustering algorithm and need all points in my scatter plot that belong to the same cluster to be drawn in the same color. I have a list that gives, for each point, the cluster it belongs to as an integer 0...k-1, where k is the number of clusters. I would like to know how to map this list to colors (preferably as many colors as there are clusters, a number that is known beforehand). I am working with matplotlib in Python and am completely lost as to how to solve this problem.

plt.scatter([item[0] for item in dataset],[item[1] for item in dataset],color='b')
plt.scatter([item[0] for item in centroids_list],[item[1] for item in centroids_list],color='r')

plt.show()

Right now this is all I have wherein the cluster points are indicated in blue and the centroids in red. I would like to leave the centroids red and only change the color of the points in the dataset such that points of the same cluster have the same color. I am lost in the sea that is the matplotlib library and would really appreciate any help.

Thanks in advance!

3 Answers

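See the color parameter in the pyplot.scatter documentation.

Basically, you need to split your data into clusters and then call pyplot.scatter in a loop, once per cluster, each time with a different value for the color parameter.

You can use vq from scipy.cluster.vq to assign the data to clusters using your centroids, like this: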

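    from scipy.cluster.vq import vq

    # vq returns (codes, distances); codes[i] is the index of the nearest centroid
    assignments = vq(dataset, centroids_list)[0]
    # One bucket per centroid, not one per data point
    clusters = [[] for _ in range(len(centroids_list))]
    for item, clustNum in zip(dataset, assignments):
        clusters[clustNum].append(item)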

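If I remember correctly, at least that is how I have done it before. From there it is just a matter of defining a function that returns a random color, and then: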

    # randomColor is a helper you define yourself; it is not part of matplotlib
    for cluster in clusters:
        plt.scatter([item[0] for item in cluster], [item[1] for item in cluster], color=randomColor())
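
The random-color helper is left undefined above. A minimal sketch of one possible version follows, assuming an RGB tuple is passed as the color argument (matplotlib accepts those). The name random_color and the sample clusters are hypothetical, made up only for illustration:

```python
import random

import matplotlib.pyplot as plt


def random_color():
    """Return a random RGB tuple with each channel in [0, 1)."""
    return (random.random(), random.random(), random.random())


# Hypothetical data: each cluster is a list of (x, y) points
clusters = [
    [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3)],
    [(5.0, 5.2), (5.1, 4.9)],
    [(9.0, 0.5), (8.8, 0.2), (9.3, 0.1)],
]

fig, ax = plt.subplots()
for cluster in clusters:
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    # One scatter call per cluster, so every point in it shares one color
    ax.scatter(xs, ys, color=random_color())
```

Call plt.show() afterwards to display the figure. Random colors can come out close to each other, so a fixed list of colors may be easier to read when the number of clusters is small.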

answered 2013-11-08T02:07:59.263

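If you use numpy arrays you can simplify the slicing, and if you pass the cluster labels to the c parameter it should work fine:

# clusters holds one integer label per point; per-point values go in c=, not color=
plt.scatter(dataset[:, 0], dataset[:, 1], c=clusters)
plt.scatter(centroids_list[:, 0], centroids_list[:, 1], s=70, c='r')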
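
For a version that runs on its own, here is a small sketch with made-up data. The arrays, the labels name, and the choice of the tab10 colormap are assumptions for illustration, not part of the question:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 6 points in 2-D and one integer cluster label per point
dataset = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.2],
                    [5.1, 4.9], [9.0, 0.5], [8.8, 0.2]])
labels = np.array([0, 0, 1, 1, 2, 2])
centroids = np.array([[0.1, 0.05], [5.05, 5.05], [8.9, 0.35]])

fig, ax = plt.subplots()
# c= maps each integer label to a color through the colormap
points = ax.scatter(dataset[:, 0], dataset[:, 1], c=labels, cmap="tab10")
# Centroids stay red and are drawn larger
ax.scatter(centroids[:, 0], centroids[:, 1], s=70, c="r")
```

A qualitative colormap such as tab10 has ten distinct colors, so it should suit up to ten clusters. For more clusters a continuous colormap still works, but neighboring labels become harder to tell apart.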

You can also use meshgrid with plt.imshow to add a colored background, as in the example here.
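
The linked example is not reproduced here, so the following is only a sketch of one way such a background could be built. It assumes the background should show which centroid is nearest at every position. The centroids, grid limits, and colormap are hypothetical:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical centroids; in practice these come from the clustering step
centroids = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 0.0]])

# Build a grid covering the plotting area
xs = np.linspace(-2.0, 11.0, 200)
ys = np.linspace(-2.0, 7.0, 150)
xx, yy = np.meshgrid(xs, ys)
grid = np.column_stack([xx.ravel(), yy.ravel()])

# Label every grid point with the index of its nearest centroid
dists = np.linalg.norm(grid[:, None, :] - centroids[None, :, :], axis=2)
region = dists.argmin(axis=1).reshape(xx.shape)

fig, ax = plt.subplots()
# origin="lower" keeps the y axis pointing up, matching the scatter plot
ax.imshow(region, extent=(xs[0], xs[-1], ys[0], ys[-1]),
          origin="lower", cmap="tab10", alpha=0.3, aspect="auto")
ax.scatter(centroids[:, 0], centroids[:, 1], s=70, c="r")
```

Drawing the data points on the same axes with the same colormap and labels makes each cluster sit on a matching background region.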

answered 2015-12-18T16:04:25.800

If you have a numpy array, you should be able to access the first column more efficiently with dataset[:,0].
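
As a small illustration with made-up points, the list comprehension from the question and the NumPy slice give the same values:

```python
import numpy as np

# Hypothetical dataset as a list of (x, y) pairs, as in the question
dataset = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]

# List-comprehension form used in the question
xs_list = [item[0] for item in dataset]

# Equivalent NumPy form: convert once, then slice columns
data = np.asarray(dataset)
xs = data[:, 0]  # first column (x values)
ys = data[:, 1]  # second column (y values)
```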

I have found that scatter sometimes behaves strangely (at least in an IPython notebook), but the plot function can do this as well.

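import matplotlib.pyplot as plt

# Single-character markers that are valid in a plot() format string.
# (Line2D.markers also contains non-string keys, which cannot be concatenated.)
markers = list("os^vD*")
colors = list("bgrcmyk")
for i, cluster in enumerate(clusters):
    # Each cluster must be a 2-D numpy array for the column slicing to work
    marker, color = markers[i % len(markers)], colors[i % len(colors)]
    # A format string with a marker and a color but no line style draws markers only
    plt.plot(cluster[:, 0], cluster[:, 1], marker + color)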

answered 2013-11-08T08:55:24.900