haskell - Haskell中更快的SumSquareDifference

Question

我正在 Haskell 中实现二进制图像的分形图像压缩算法。为此，我必须在所谓的域池中找到给定范围块（子图像）最接近的图像，即图像列表列表。我通过计算两个像素值的平方和差来比较图像。

我使用 Haskell 图像处理 (HIP) 库来读取和写入图像。

compress :: Image VS X Bit -> Int -> [(Int, Int)]
compress img blockSize = zip dIndices tIndices
    where rImg = img
          dImg = downsample2 rImg
          rBlocks = (toBlocks rImg blockSize) :: [Image VS X Bit]
          dBlocks = (toBlocks dImg blockSize) :: [Image VS X Bit]
          dPool = (createDPool dBlocks) :: [[Image VS X Bit]]
          distanceLists = map (\x -> (map.map) (distance x) dPool) rBlocks
          dIndices = map (fst . getMinIndices) distanceLists
          tIndices = map (snd . getMinIndices) distanceLists


distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . toLists

toLists :: MArray arr cs e => Image arr cs e -> [[Pixel cs e]]
toLists img = [[index img (i, j) | j <- [0..cols img -1]] | i <- [0.. rows img -1]]

extractBitOfPixel :: Pixel X Bit -> Bit
extractBitOfPixel (PixelX b) = b

sumSquareDifference :: [Int] -> [Int] -> Int
sumSquareDifference a b = sum $ zipWith (\x y -> (x-y)^2) a b

这段代码的性能真的很差。尽管使用-O2. 分析向我展示了大部分运行时都花在了函数distance中，尤其是在 sumSquareDifference，而且在toLists和toBinList：

       binaryCompressionSimple +RTS -p -RTS

    total time  =     1430.89 secs   (1430893 ticks @ 1000 us, 1 processor)
    total alloc = 609,573,757,744 bytes  (excludes profiling overheads)

COST CENTRE               MODULE    SRC                                        %time %alloc

sumSquareDifference       Main      binaryCompressionSimple.hs:87:1-63          30.9   28.3
toLists                   Main      binaryCompressionSimple.hs:66:1-90          20.3   47.0
distance.toBinList        Main      binaryCompressionSimple.hs:74:11-79         10.9   15.1
main                      Main      binaryCompressionSimple.hs:(14,1)-(24,21)    7.3    0.0
compress                  Main      binaryCompressionSimple.hs:(28,1)-(36,60)    6.9    0.0
distance                  Main      binaryCompressionSimple.hs:(71,1)-(74,79)    5.7    0.9
compress.distanceLists.\  Main      binaryCompressionSimple.hs:34:38-65          5.2    4.4
compress.distanceLists    Main      binaryCompressionSimple.hs:34:11-74          2.8    0.0
main.\                    Main      binaryCompressionSimple.hs:20:72-128         2.7    0.0
getMinIndices.getMinIndex Main      binaryCompressionSimple.hs:116:11-53         2.7    1.8
sumSquareDifference.\     Main      binaryCompressionSimple.hs:87:52-58          2.7    2.5

有没有办法提高性能？

块大小为 2 表示将 16384 个范围块与域池的 131072 个图像进行比较，因此 sumSquareDifference 将被调用 (16384*131072=)2147483648 次，每次计算两个长度为 4 的列表的平方和差。我意识到这是一个很大的数字，但代码不应该更快（懒惰评估列表）吗？这是 Haskell 问题还是算法问题？

编辑：

通过使用，我至少能够将性能提高三分之一：

distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y
     | x == y = 0
     | otherwise = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . inlinedToLists

编辑2：

dPool通过使用函数创建，我能够极大地提高性能genDistanceList，一旦找到两个相同的图像就会停止计算：

genDistanceList :: [[Image VS X Bit]] -> Image VS X Bit -> [[Int]]
genDistanceList dPool rBlock = nestedTakeWhileInclusive (/= 0) $ (map.map) (distance rBlock) dPool

score 3 · Accepted Answer

绝对要尝试的第一件事是跳过转换为列表：

{-# INLINE numIndex #-}
numIndex :: Image VS X Bit -> (Int, Int) -> Int
numIndex img pos = toNum . extractBitOfPixel $ index img pos

distance :: Image VS X Bit -> Image VS X Bit -> Int
distance a b = sum
    [ (numIndex a pos - numIndex b pos)^2
    | i <- [0 .. cols a-1]
    , j <- [0 .. rows a-1]
    , let pos = (i, j)
    ]

由于您没有为我们提供最小的可重现示例，因此无法判断会产生什么影响（如果有的话）。如果您想要更好的建议，请提供更好的数据。

编辑

通过黑线鳕寻找臀部，我怀疑以下内容会更好：

distance :: Image VS X Bit -> Image VS X Bit -> Int
distance a b = id
    . getX
    . fold (+)
    $ zipWith bitDistance a b

bitDistance :: Pixel X Bit -> Pixel X Bit -> Pixel X Int
bitDistance (PixelX a) (PixelX b) = PixelX (fromIntegral (a-b))
-- use (a-b)^2 when you switch to grayscale, but for Bit the squaring isn't needed

在这里，foldandzipWith是由hip, not提供的base。

haskell - Haskell中更快的SumSquareDifference

1 回答 1

Related

Reference