python - K-means color clustering - omitting background pixels using a masked numpy array

Question

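I am trying to find the 3 dominant colors of several images using K-means clustering. The problem I am facing is that K-means also clusters the background of the image. I am using Python 2.7 and OpenCV 3.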

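All images share the same grey background with the RGB color 150,150,150. To keep K-means from clustering the background color as well, I created a masked array that masks all "150" pixel values out of the original image array, in theory leaving only the non-background pixels for K-means to work with. However, when I run my script it still returns grey as one of the dominant colors.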

My question: is a masked array the way to go (and what am I doing wrong), or is there a better way to exclude pixels from K-means clustering?

Please find my code below:

from sklearn.cluster import KMeans
from sklearn import metrics
import cv2
import numpy as np

def centroid_histogram(clt):
    numLabels = np.arange(0, len(np.unique(clt.labels_)) + 1)
    (hist, _) = np.histogram(clt.labels_, bins=numLabels)
    hist = hist.astype("float")
    hist /= hist.sum()
    return hist

image = cv2.imread("test1.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

h, w, _ = image.shape
w_new = int(100 * w / max(w, h))
h_new = int(100 * h / max(w, h))
image = cv2.resize(image, (w_new, h_new))

image_array = image.reshape((image.shape[0] * image.shape[1], 3))
image_array = np.ma.masked_values(image_array,150)

clt = KMeans(n_clusters=3)
clt.fit(image_array)

hist = centroid_histogram(clt)
zipped = zip(hist, clt.cluster_centers_)
zipped.sort(reverse=True, key=lambda x: x[0])

hist, clt.cluster_centers = zip(*zipped)
print(clt.cluster_centers_)

score 2 · Accepted Answer

If you want to extract the pixel values other than the background, you can use numpy indexing:

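# image_array is the plain reshaped (N, 3) array, before np.ma.masked_values is applied
# Keep only the rows (pixels) where at least one channel differs from the background
img2 = image_array[np.any(image_array != [150, 150, 150], axis=1)]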

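This yields a list of the pixels that are not [150,150,150].
However, it does not preserve the structure of the image; it only gives you a list of pixel values. I don't quite remember, but perhaps K-means needs the whole image, i.e. you would also need to give it the positions of the pixels? In that case no masking would help, since a mask only replaces the values of some pixels with another value rather than getting rid of the pixels altogether.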
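
For clustering purely by color, a flat list of pixels looks sufficient, since the code in the question already passes KMeans an N x 3 array with no positions. Below is a minimal sketch of the whole idea under some assumptions: a synthetic array stands in for the resized image (so there is no `cv2.imread` call), and the names `BACKGROUND`, `is_foreground` and `foreground` are made up for illustration. It also shows that converting a masked array to a plain ndarray exposes the grey values again, which is probably why the mask in the question had no effect on the result.

```python
import numpy as np
from sklearn.cluster import KMeans

BACKGROUND = np.array([150, 150, 150], dtype=np.uint8)  # grey from the question

# Synthetic stand-in for the reshaped image (N x 3, RGB). This is an assumption
# for illustration; in the question this array comes from cv2.imread + reshape.
rng = np.random.RandomState(0)
red = np.clip(rng.normal([200, 30, 30], 5, size=(300, 3)), 0, 255)
green = np.clip(rng.normal([30, 200, 30], 5, size=(200, 3)), 0, 255)
blue = np.clip(rng.normal([30, 30, 200], 5, size=(100, 3)), 0, 255)
background = np.tile(BACKGROUND, (400, 1))
image_array = np.vstack([red, green, blue, background]).astype(np.uint8)

# A masked array keeps the grey values in its underlying data, so turning it
# into a plain ndarray brings the background pixels back.
masked = np.ma.masked_values(image_array, 150)
plain = np.asarray(masked)
print((plain == 150).all(axis=1).sum(), "grey pixels survive the mask")

# Drop the background for real: keep a row if any channel differs from grey.
is_foreground = np.any(image_array != BACKGROUND, axis=1)
foreground = image_array[is_foreground].astype(np.float64)

clt = KMeans(n_clusters=3, n_init=10, random_state=0)
clt.fit(foreground)

# Share of foreground pixels per cluster, largest cluster first.
counts = np.bincount(clt.labels_, minlength=3)
order = np.argsort(counts)[::-1]
shares = counts[order] / float(counts.sum())
centers = clt.cluster_centers_[order]
print(shares)
print(centers.round())
```

Note that the row-wise test only removes pixels that are exactly 150,150,150. With JPEG input and resizing, background pixels can drift by a few levels, so a tolerance-based test (for example comparing the absolute difference to a small threshold) may be needed in practice.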
