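I have a large image in numpy array form (opencv returns it as a 2d array of 3 uint8 values) and want to compute a sum of gaussian kernels for each pixel, i.e. (there's still no LaTeX support in SO, is there?):

$$p(x) = \sum_{i=1}^{N} w_i \, \frac{1}{(2\pi)^{3/2} |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right)$$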

for N different kernels with a specified weight w, mean and diagonal covariance matrix.

So basically I want a function compute_densities(image, kernels) -> numpy array of floats. What's the best way to do this efficiently in Python? I'd be surprised if there wasn't already a library function in scipy for this, but I had statistics at uni a long time ago, so I do get a bit confused by the details of the documentation.

Basically I want the following, just way more efficient than naive Python ((2pi)^{-3/2} is ignored since it's a constant factor that doesn't matter for me, as I'm only interested in ratios between the probabilities):

import math

import numpy as np

def compute_probabilities(img, kernels):
    np.seterr(divide='ignore') # 1 / covariance logs an error otherwise
    result = np.zeros((img.shape[0], img.shape[1]))
    for row_pos, row_val in enumerate(img):
        for col_pos, val in enumerate(row_val):
            prob = 0.0
            for kernel in kernels:
                mean, covariance, weight = kernel
                val_sub_mu = np.array([val]).T - mean
                cov_inv = np.where(covariance != 0, 1 / covariance, 0)
                tmp = val_sub_mu.T.dot(cov_inv).dot(val_sub_mu)
                prob += weight / np.sqrt(np.linalg.norm(covariance)) * \
                        math.exp(-0.5 * tmp)
            result[row_pos][col_pos] = prob
    return result

Input: cv2.imread on some jpg, which gives a 2d array (height x width) of a 3 uint8 struct containing the 3 color channels.

Each kernel is a namedtuple('Kernel', 'mean covariance weight'): mean is a vector, covariance is a 3x3 matrix with everything but the diagonal being zero, and weight is a float with 0 < weight < 1. For simplicity I only specify the diagonals and then convert them to 3x3 matrices afterwards (the representation isn't set in stone, I don't care how it's represented, so feel free to change all of that):

some_kernels = [
   Kernel(np.array([(73.53, 29.94, 17.76)]), np.array([(765.40, 121.44, 112.80)]), 0.0294),
]

def fixup_kernels(kernels):
    new_kernels = []
    for kernel in kernels:
        cov = np.zeros((3, 3))
        for pos, c in enumerate(kernel.covariance[0]):
            cov[pos][pos] = c
        new_kernels.append(Kernel(kernel.mean.T, cov, kernel.weight))
    return new_kernels

some_kernels = fixup_kernels(some_kernels)
img = cv2.imread("something.jpg")
result = compute_probabilities(img, some_kernels)

def compute_probabilities_fast(img, kernels):
    result = np.zeros((img.shape[0], img.shape[1]))
    for kernel in kernels:
        mean, covariance, weight = kernel
        cov_inv = np.where(covariance != 0, 1 / covariance, 0)
        mean = mean[:,0]
        img_sub_mu = img - mean
        img_tmp = np.sum( img_sub_mu.dot(cov_inv) * img_sub_mu, axis=2 )
        result += (weight / np.sqrt(np.linalg.norm(covariance))) * np.exp(-0.5 * img_tmp)
    return result


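mean[:,0] makes the shape simply (3,) rather than (3,1).

img - mean broadcasts over the whole image and subtracts the mean from every pixel.


np.sum( ... * img_sub_mu, axis=2 ) is roughly equivalent to .dot(val_sub_mu). However, dot can't be used here, because it would add extra dimensions. For example, an M x N x K array dotted with an M x K x N array produces an M x N x M x N result; dot behaves differently on one-dimensional and multidimensional data. So we just do an element-wise multiplication and then sum along the last dimension.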
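To make the shapes above concrete, here is a minimal sketch on a made-up 4 x 5 image. The random image and the einsum cross-check are my own additions for illustration; the mean and variances are the ones from the question's kernel.

```python
import numpy as np

# Hypothetical tiny "image": 4 rows x 5 columns x 3 channels of uint8.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3)).astype(np.uint8)

mean = np.array([[73.53], [29.94], [17.76]])   # shape (3, 1), as after fixup_kernels
cov_inv = np.diag(1.0 / np.array([765.40, 121.44, 112.80]))

mean_flat = mean[:, 0]            # shape (3,)
img_sub_mu = img - mean_flat      # broadcasts over all pixels: shape (4, 5, 3), float64

# Per-pixel quadratic form (x - mu)^T cov_inv (x - mu): shape (4, 5).
quad = np.sum(img_sub_mu.dot(cov_inv) * img_sub_mu, axis=2)

# The same contraction written explicitly with einsum, as a cross-check.
quad_einsum = np.einsum('ijk,kl,ijl->ij', img_sub_mu, cov_inv, img_sub_mu)

# np.dot on two 3-D arrays contracts the last axis of the first with the
# second-to-last axis of the second and keeps every other axis.
blown_up = np.dot(np.ones((4, 5, 3)), np.ones((4, 3, 5)))
print(quad.shape, blown_up.shape)
```

The last line illustrates why a plain dot of two image-shaped arrays is not what you want: the result is four-dimensional rather than one value per pixel.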


PS: 1 / covariance is problematic. Are you sure you don't want np.linalg.inv(covariance)?
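A quick check of when the two agree, as a sketch: for a strictly diagonal matrix the masked element-wise reciprocal equals the true inverse, but for a general covariance it does not. The full_cov matrix below is made up purely for illustration.

```python
import numpy as np

# Diagonal covariance built from the variances in the question.
diag_cov = np.diag([765.40, 121.44, 112.80])

with np.errstate(divide='ignore'):   # 1 / 0 on the off-diagonal entries
    elementwise = np.where(diag_cov != 0, 1 / diag_cov, 0)
same_for_diagonal = np.allclose(elementwise, np.linalg.inv(diag_cov))

# A made-up covariance with off-diagonal terms (hypothetical values).
full_cov = np.array([[4.0, 1.0, 0.0],
                     [1.0, 3.0, 0.5],
                     [0.0, 0.5, 2.0]])
with np.errstate(divide='ignore'):
    elementwise_full = np.where(full_cov != 0, 1 / full_cov, 0)
same_for_full = np.allclose(elementwise_full, np.linalg.inv(full_cov))

print(same_for_diagonal, same_for_full)
```

So the element-wise trick is only safe as long as the covariance really stays diagonal.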





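The question is a bit confusing: are you trying to compute a bunch of images convolved with different Gaussians, or a single image convolved with a sum of Gaussians? Are your kernels separable? (If so, use two convolutions, Mx1 and 1xN, instead of one MxN.) The scipy functions you would use are the same in either case.

Of course, you'll also want to precompute your kernels using a combination of numpy.random.normal and meshgrid.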
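For what "separable" buys you, here is a rough NumPy-only sketch: two 1-D passes give the same result as one pass with the full 2-D kernel. The helper names gaussian_1d, convolve_separable and convolve_direct are hypothetical, and zero padding at the borders is an assumption on my part.

```python
import numpy as np

def gaussian_1d(sigma, radius):
    """Sampled, normalized 1-D Gaussian with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()

def convolve_separable(channel, g_rows, g_cols):
    """Filter along axis 0, then along axis 1 (same size, zero padding)."""
    tmp = np.apply_along_axis(np.convolve, 0, channel, g_rows, mode='same')
    return np.apply_along_axis(np.convolve, 1, tmp, g_cols, mode='same')

def convolve_direct(channel, kernel_2d):
    """Slow reference with the full 2-D kernel (same size, zero padding).

    This slides the kernel without flipping it, which equals convolution
    here only because the Gaussian kernel is symmetric.
    """
    kh, kw = kernel_2d.shape
    padded = np.pad(channel, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode='constant')
    out = np.empty(channel.shape, dtype=float)
    for i in range(channel.shape[0]):
        for j in range(channel.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel_2d)
    return out

rng = np.random.default_rng(0)
channel = rng.random((12, 15))     # made-up single-channel image
g = gaussian_1d(sigma=1.5, radius=3)

two_pass = convolve_separable(channel, g, g)
direct = convolve_direct(channel, np.outer(g, g))
print(np.allclose(two_pass, direct))
```

For a K x K kernel the two-pass version does about 2K multiplications per pixel instead of K squared.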

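[At present (2013-11-20) the code in your question and in @Alex I's answer contains a bug: the | | around \Sigma in the equation above actually denotes the determinant, not a vector norm - see e.g. here. In the case of a diagonal covariance, the determinant is just the product of the diagonal elements.]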
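A small sketch of the difference, using the variances from the question's kernel: for a diagonal matrix, np.linalg.det matches the product of the diagonal, while np.linalg.norm (the Frobenius norm by default for a 2-D array) is a different quantity altogether.

```python
import numpy as np

cov = np.diag([765.40, 121.44, 112.80])

det = np.linalg.det(cov)              # |Sigma| in the density formula
diag_product = np.prod(np.diag(cov))  # same value for a diagonal matrix
frobenius = np.linalg.norm(cov)       # sqrt of the sum of squared entries

print(np.isclose(det, diag_product), np.isclose(det, frobenius))
```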

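The density computation can be implemented very efficiently in terms of numpy array operations. The following implementation exploits the diagonal nature of the covariance matrices in your question: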

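def compute_probabilities_faster(img, kernels):
    # Stack the per-kernel parameters along a third axis:
    # means and covs get shape (1, 3, N), weights gets shape (1, 1, N).
    means, covs, weights = map(np.dstack, zip(*kernels))
    # One entry per pixel: shape (height * width, 3, 1).
    pixels_as_rows = img.reshape((-1, 3, 1))
    # Diagonal covariance: the quadratic form is a sum of scaled squared differences.
    responses = np.exp(-0.5 * ((pixels_as_rows - means) ** 2 / covs).sum(axis=1))
    # The determinant of a diagonal matrix is the product of its diagonal.
    factors = 1. / np.sqrt(covs.prod(axis=1) * ((2 * np.pi) ** 3))
    return np.sum(responses * factors * weights, axis=2).reshape(img.shape[:2])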
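For reference, this is the simplification the code above relies on, written out. The symbols \sigma_{ij}^2 (the j-th diagonal entry of \Sigma_i) and \mu_{ij}, x_j (the j-th components of \mu_i and x) are my own notation, not something from the question.

```latex
\Sigma_i = \operatorname{diag}(\sigma_{i1}^2, \sigma_{i2}^2, \sigma_{i3}^2)
\quad\Longrightarrow\quad
(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i) = \sum_{j=1}^{3} \frac{(x_j - \mu_{ij})^2}{\sigma_{ij}^2},
\qquad
|\Sigma_i| = \prod_{j=1}^{3} \sigma_{ij}^2

p(x) = \sum_{i=1}^{N} \frac{w_i}{\sqrt{(2\pi)^3 \prod_{j=1}^{3} \sigma_{ij}^2}}
       \exp\left(-\frac{1}{2} \sum_{j=1}^{3} \frac{(x_j - \mu_{ij})^2}{\sigma_{ij}^2}\right)
```

In the code, the sum over j is the .sum(axis=1), the product over j is covs.prod(axis=1), and the sum over i is the final np.sum(..., axis=2).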

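The function compute_probabilities_faster operates directly on the kernels as originally represented, i.e. there is no need to run them through your fixup_kernels function first. When the normalization factor (2 * np.pi) ** 3 is removed (and the call to linalg.norm in your code is replaced with linalg.det), this function matches the output of your code (closely enough to satisfy np.allclose).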
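If you want to reproduce that comparison yourself, a self-contained sketch could look like the following. The names densities_loop and densities_vectorized are hypothetical stand-ins, the second kernel and the random test image are made up, and both versions leave out the (2 * np.pi) ** 3 factor.

```python
from collections import namedtuple

import numpy as np

Kernel = namedtuple('Kernel', 'mean covariance weight')

# First kernel is from the question; the second is made up for illustration.
kernels = [
    Kernel(np.array([(73.53, 29.94, 17.76)]), np.array([(765.40, 121.44, 112.80)]), 0.0294),
    Kernel(np.array([(120.0, 80.0, 60.0)]), np.array([(400.0, 250.0, 300.0)]), 0.05),
]

def densities_loop(img, kernels):
    """Naive per-pixel reference using the determinant, no (2*pi)**3 factor."""
    result = np.zeros(img.shape[:2])
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            x = img[r, c].astype(float)
            for k in kernels:
                cov = np.diag(k.covariance[0])
                d = x - k.mean[0]
                quad = d.dot(np.linalg.inv(cov)).dot(d)
                result[r, c] += k.weight / np.sqrt(np.linalg.det(cov)) * np.exp(-0.5 * quad)
    return result

def densities_vectorized(img, kernels):
    """Vectorized version for diagonal covariances, no (2*pi)**3 factor."""
    means, covs, weights = map(np.dstack, zip(*kernels))
    pixels_as_rows = img.reshape((-1, 3, 1))
    responses = np.exp(-0.5 * ((pixels_as_rows - means) ** 2 / covs).sum(axis=1))
    factors = 1.0 / np.sqrt(covs.prod(axis=1))
    return np.sum(responses * factors * weights, axis=2).reshape(img.shape[:2])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(6, 7, 3)).astype(np.uint8)  # made-up test image

loop_result = densities_loop(img, kernels)
fast_result = densities_vectorized(img, kernels)
# The densities are tiny, so compare with a relative tolerance only.
print(np.allclose(loop_result, fast_result, rtol=1e-8, atol=0))
```

The default absolute tolerance of np.allclose is larger than many of these density values, which is why the sketch sets atol=0 and relies on the relative tolerance instead.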

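The closest out-of-the-box functionality in SciPy (as of 0.13) is the implementation of kernel density estimation in scipy.stats (see here), which defines a very similar distribution in which the covariance matrix is the same for every kernel. For that reason it is not suitable for your problem.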

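The way to get performance out of Python is to not use Python.

There are many packages that use Python syntax but then run on a C or C++ back end. NumPy itself does this. Your problem seems tailor-made for something like Cython or numexpr. Both of those links show you how to use either system for kernels over NumPy vectors.

Edit: I wish one of my downvoters would let me know where I'm wrong. I suggested an approach for the case where no pre-made function can be found. If you know of an approach with better performance than Cython or numexpr (i.e. ways of writing C in Python syntax), I'd love to hear it.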

