python - Faster way to analyze each sub-window in an image?

Question

I am trying to compute an entropy feature for each sub-window of an image. Here is the code I wrote:

  import cv2
  import numpy as np

  def genHist(img):
    # density=True replaces the older normed=True argument
    hist = np.histogram(img, np.arange(0, 256), density=True)
    return hist[0]

  def calcEntropy(hist):
    logs = np.nan_to_num(np.log2(hist))
    hist_loghist = hist * logs
    entropy = -1 * hist_loghist.sum()
    return entropy

  img = cv2.imread("lena.jpg", 0)   # 0 = load as grayscale
  result = np.zeros(img.shape, dtype=np.float16)
  h, w = img.shape
  subwin_size = 5
  for y in range(subwin_size, h-subwin_size):
      for x in range(subwin_size, w-subwin_size):
          subwin = img[y-subwin_size:y+subwin_size, x-subwin_size:x+subwin_size]
          hist = genHist(subwin)         # Generate histogram
          entropy = calcEntropy(hist)    # Calculate entropy
          result[y, x] = entropy

It does work. The problem is its speed: it is far too slow. Do you have any ideas for making it faster?

score 2 · Accepted Answer

There are a few modifications you can make to speed it up.

Your code takes the following time on my laptop:

IPython CPU timings (estimated):
  User   :      50.92 s.
  System :       0.01 s.
Wall time:      51.20 s.

I made the following modifications:

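1 - Removed the function genHist and implemented it inside calcEntropy(). This saves perhaps 1 or 2 seconds.

2 - Instead of logs = np.nan_to_num(np.log2(hist)), I simply added a small value, 0.00001, to hist before taking the log: logs = np.log2(hist+0.00001). This saves 3-4 seconds, but it changes your output slightly. The maximum error I got between the two results was 0.0039062. (So it is up to you whether you want this.)

3 - Changed np.histogram to cv2.calcHist(). This saves more than 25 seconds.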
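To make the trade-off in modification 2 concrete, here is a minimal sketch that compares the entropy with zero bins skipped against the epsilon version. The function names entropy_exact and entropy_eps are hypothetical, and the sample histogram is made up for illustration. Each non-zero bin contributes an error of at most eps/ln(2), so the epsilon alone should move the result by well under 0.002 for a 100-pixel window. The reported 0.0039062 equals 2^-8, which is the spacing between adjacent float16 values from 4 to 8, so that figure may mostly reflect rounding in the float16 result array; this is an inference, not something stated above.

```python
import numpy as np

def entropy_exact(hist):
    """Entropy in bits, skipping empty bins (0 * log(0) is treated as 0)."""
    nz = hist[hist > 0]
    return -(nz * np.log2(nz)).sum()

def entropy_eps(hist, eps=0.00001):
    """Entropy in bits, with a small epsilon added so log2 never sees 0."""
    return -(hist * np.log2(hist + eps)).sum()

# Illustrative normalized histogram with only three occupied bins.
hist = np.zeros(256)
hist[[10, 20, 30]] = [0.5, 0.25, 0.25]

exact = entropy_exact(hist)
approx = entropy_eps(hist)
print("exact: %.6f  with epsilon: %.6f  difference: %.2e"
      % (exact, approx, exact - approx))
```

The epsilon version always comes out slightly lower, because log2(p + eps) is larger than log2(p) for every occupied bin.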

Now the code takes the following time on my laptop:

IPython CPU timings (estimated):
  User   :      13.38 s.
  System :       0.00 s.
Wall time:      13.41 s.

That is more than 3 times faster.

Code:

import cv2
import numpy as np

def calcEntropy(img):
    #hist,_ = np.histogram(img, np.arange(0, 256), normed=True)
    hist = cv2.calcHist([img],[0],None,[256],[0,256])
    hist = hist.ravel()/hist.sum()
    #logs = np.nan_to_num(np.log2(hist))
    logs = np.log2(hist+0.00001)
    #hist_loghist = hist * logs
    entropy = -1 * (hist*logs).sum()
    return entropy

img = cv2.imread("lena.jpg", 0)
result2 = np.zeros(img.shape, dtype=np.float16)
h, w = img.shape
subwin_size = 5
for y in range(subwin_size, h-subwin_size):
    for x in range(subwin_size, w-subwin_size):
        subwin = img[y-subwin_size:y+subwin_size, x-subwin_size:x+subwin_size]
        #hist = genHist(subwin)         # Generate histogram
        entropy = calcEntropy(subwin)    # Calculate entropy
        # originally result2.itemset(y,x,entropy); itemset was removed in NumPy 2.0
        result2[y, x] = entropy
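One detail is worth checking when comparing the two versions. np.histogram treats np.arange(0, 256) as bin edges, which gives 255 bins, with pixel values 254 and 255 sharing the last one. cv2.calcHist with [256] and [0,256] uses 256 bins. The small sketch below shows the difference using NumPy only; the 2x2 window is made-up data.

```python
import numpy as np

# Tiny illustrative "window" containing the two brightest gray levels.
window = np.array([[254, 255], [255, 0]], dtype=np.uint8)

# Edges 0..255 -> 255 bins; the last bin is [254, 255] and is closed on the right.
counts_255, _ = np.histogram(window, np.arange(0, 256))

# Edges 0..256 -> 256 bins, one per gray level.
counts_256, _ = np.histogram(window, np.arange(0, 257))

print(len(counts_255), counts_255[-1])    # 254 and both 255s land in one bin
print(len(counts_256), counts_256[-2:])   # 254 and 255 are counted separately
```

Passing np.arange(0, 257) to np.histogram gives one bin per gray level, which matches the binning used by the calcHist call.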

Now the main problem is the two for loops. I think this is a prime candidate for a Cython implementation, which should give very good results.
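Short of Cython, the two loops can also be removed in plain NumPy. The sketch below is an alternative technique, not what this answer proposes: for each gray level g it builds an integral image (summed-area table) of the mask img == g, reads every window count with four lookups, and accumulates -p*log2(p). The function name local_entropy and the parameter half are hypothetical. It assumes an 8-bit single-channel image and reproduces the window img[y-half:y+half, x-half:x+half] from the loops above. It has not been timed against the figures in this answer.

```python
import numpy as np

def local_entropy(img, half=5):
    """Entropy (bits) of the window img[y-half:y+half, x-half:x+half] for each (y, x).

    Border pixels that the original double loop skips are left at 0.
    """
    h, w = img.shape
    k = 2 * half                      # window side length
    n = float(k * k)                  # pixels per window
    acc = np.zeros((h - k + 1, w - k + 1), dtype=np.float64)

    for g in np.unique(img):          # only gray levels that actually occur
        mask = (img == g).astype(np.int64)
        # Integral image with a leading row and column of zeros.
        ii = np.zeros((h + 1, w + 1), dtype=np.int64)
        ii[1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1)
        # counts[i, j] = number of pixels equal to g in img[i:i+k, j:j+k]
        counts = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
        p = counts / n
        nz = p > 0                    # skip empty bins: 0 * log(0) counts as 0
        acc[nz] -= p[nz] * np.log2(p[nz])

    out = np.zeros((h, w), dtype=np.float64)
    # The window starting at (i, j) belongs to the pixel (i + half, j + half).
    # The original loops stop one position early, hence acc[:-1, :-1].
    out[half:h - half, half:w - half] = acc[:-1, :-1]
    return out

# Usage on random data standing in for a grayscale image.
demo = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
demo_result = local_entropy(demo, half=5)
```

The cost is one pass over the image per distinct gray level rather than one Python-level call per pixel. Empty bins are skipped exactly, so no epsilon is needed.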

score -1 · Accepted Answer

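As a first step, you should try using math.log instead of the corresponding numpy function, which is much slower when called on one value at a time: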

import math
import time
import numpy as np

x = np.abs(np.random.randn(1000000))

# using numpy
start = time.time()
for i in x:
    np.log2(i)
print("Runtime: %f s" % (time.time()-start))
# Output: Runtime: 3.653858 s

# using math.log
start = time.time()
for i in x:
    math.log(i,2)        # use log with base 2
print("Runtime: %f s" % (time.time()-start))
# Output: Runtime: 0.692702 s

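The problem with this is that math.log raises an error every time it encounters 0. You can get around this by removing all 0 entries from the histogram output. This has several advantages: 1) math.log will not fail, and 2) depending on your image, math.log will be called fewer times, which makes the code faster. You can remove the zeros because 0*log(0) is 0 even if log(0) were to return a value, so that product adds nothing to the entropy sum.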
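The zero-removal idea above can be sketched as follows. The function names entropy_math and entropy_numpy are hypothetical. The second variant is an addition that the answer does not mention: once the zeros are filtered out, np.log2 can be applied to the whole array in a single call instead of once per value, which avoids the per-scalar overhead measured in the benchmark.

```python
import math
import numpy as np

def entropy_math(hist):
    """Entropy in bits using math.log, skipping zero bins so log never sees 0."""
    return -sum(p * math.log(p, 2) for p in hist if p > 0)

def entropy_numpy(hist):
    """Same result, with one vectorized np.log2 call on the non-zero bins."""
    hist = np.asarray(hist, dtype=np.float64)
    nz = hist[hist > 0]
    return -(nz * np.log2(nz)).sum()

# Illustrative normalized histogram containing empty bins.
hist = [0.5, 0.0, 0.25, 0.25, 0.0]
print(entropy_math(hist), entropy_numpy(hist))
```

Both functions expect a normalized histogram, that is, values that sum to 1.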

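I ran into the same problem when processing some audio. Unfortunately, I could not improve it beyond what is described above. If you find a better solution, I would be very glad if you posted it here.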
