haskell - Haskell中可变非整数类型的可变列表

Question

我正在尝试从二进制文件中解析一个巨大的复杂值的 3d 数据数组。稍后这应该成为l矩阵 ( n x m)。由于我要处理这些矩阵，因此我仅限于矩阵库 - hmatrix 似乎很有希望。数据布局不是我要求的格式，所以我必须在 position 中跳来跳去(i,j,k) -> (k,i,j)，其中i和j是的元素n和m的k元素l。

我认为阅读这个的唯一方法是我使用可变变量，否则我最终会得到几个太字节的垃圾。我的想法是使用盒装互数组或互矩阵向量（来自 Numeric.LinearAlgebra.Devel 的 STMatrix），所以我最终得到如下结果：

data MVector s (STMatrix s t)

但我不确定如何正确使用它们：我可以通过 modify 修改 MVector 的一个元素：

modify :: PrimMonad m => MVector (PrimState m) a -> (a -> a) -> Int -> m ()

或使用 modifyM （奇怪：堆栈向量-0.12.3.0 中没有 modifyM...）

modifyM :: PrimMonad m => MVector (PrimState m) a -> (a -> m a) -> Int -> m ()

所以我可以使用函数调用(a -> a)runST 例程来修改 SMatrix。我不确定，是否应该将 ST 放入 IO（？）

尽管如此 - 我认为，这应该有效，但只有在我想修改整个矩阵时才有用，调用这个(a->a)-routine n x m x l- 时间会有点开销（也许它会被优化出来......）。所以我最终会在编组数组时通过指针修改内容(i,j,k) -> (k,i,j)并逐个矩阵读取所有内容 - 但这感觉不对，我想避免这种肮脏的技巧。

你有什么想法可以做到这一点，但更...干净？泰

编辑： 感谢 KA Buhr。他的解决方案到目前为止有效。现在，我只是遇到了一些性能影响。如果我比较解决方案：

{-# LANGUAGE BangPatterns #-}
module Main where
import Data.List
import Numeric.LinearAlgebra
import qualified Data.Vector as V
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Storable.Mutable as VSM

-- Create an l-length list of n x m hmatrix Matrices
toMatrices :: Int -> Int -> Int -> [C] -> [Matrix C]
toMatrices l n m dats = map (reshape m) $ VS.createT $ do
  mats <- V.replicateM l $ VSM.unsafeNew (m*n)
  sequence_ $ zipWith (\(i,j,k) x ->
      VSM.unsafeWrite (mats V.! k) (loc i j) x) idxs (dats ++ repeat 0)
  return $ V.toList mats

  where idxs = (,,) <$> [0..n-1] <*> [0..m-1] <*> [0..l-1]
        loc i j = i*m + j

test1 = toMatrices 1000 1000 100 (fromIntegral <$> [1..])

main = do
  let !a = test1
  print "done"

使用最简单的 C 代码：

#include <stdlib.h>
#include <stdio.h>
void main() 
{

    const int n = 1000;
    const int m = 1000;
    const int l = 100;

    double *src = malloc(n*m*l * sizeof(double));
    for (int i = 0; i < n*m*l; i++) {
        src[i] = (double)i;
    }

    double *dest = malloc(n*m*l * sizeof(double));
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            for (int k = 0; k < l; k++) {
                dest[k*n*m+i*m+j] = src[i*m*l+j*l+k];
            }
        }
    }
    printf("done: %f\n", dest[n*m*l - 1]); // Need to access the array, otherwise it'll get lost by -O2
    free(src);
    free(dest);
}

使用 -O2 编译的两者都给出了以下性能猜测：

real    0m5,611s
user    0m14,845s
sys 0m2,759s

对比

real    0m0,441s
user    0m0,200s
sys 0m0,240s

这大约是每个核心性能的 2 个数量级。从分析中我了解到

      VSM.unsafeWrite (mats V.! k) (loc i j) x

是昂贵的功能。由于我将在类似分钟的间隔中使用此过程，因此我希望将解析时间保持在与磁盘访问时间一样低的时间。我会看看，如果我能加快这个速度

PS：这是一些测试，如果我可以将通常的 DSP 从 C 类移动到 Haskell

Edit2： 好的，这是我尝试总和后得到的：

{-# LANGUAGE BangPatterns #-}

module Main where

import Data.List
import qualified Data.Vector as V
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Storable.Mutable as VSM
import Numeric.LinearAlgebra

-- Create an l-length list of n x m hmatrix Matrices
toMatrices :: Int -> Int -> Int -> VS.Vector C -> V.Vector (Matrix C)
toMatrices l n m dats =
  V.map (reshape m) newMat
   where
    newMat = VS.createT $
      V.generateM l $ \k -> do
      curMat <- VSM.unsafeNew (m * n)
      VS.mapM_
        (\i ->
           VS.mapM_
             (\j -> VSM.unsafeWrite curMat (loc i j) (dats VS.! (oldLoc i j k)))
            idjs)
        idis
      return curMat
    loc i j = i * m + j
    oldLoc i j k = i * m * l + j * l + k
    !idis = VS.generate n (\a->a)
    !idjs = VS.generate m (\a->a)

test1 = toMatrices 100 1000 1000 arr
  where
    arr = VS.generate (1000 * 1000 * 100) fromIntegral :: VS.Vector C

main = do
  let !a = test1
  print "done"

它给出了一些关于：

real    0m1,816s
user    0m1,636s
sys 0m1,120s

，所以比 C 代码慢 4 倍。我想我可以忍受这个。我想，我正在用这段代码破坏向量的所有流功能。如果有任何建议可以让它们以可比的速度返回，我将不胜感激！

score 1 · Accepted Answer

据我了解，您有一组以i-major、j-middling、k-minor 顺序排列的“巨大”数据，并且您希望将其加载到k由其元素具有i-indexed 行和j-indexed 列的索引的矩阵中，对吗？所以，你想要一个类似的功能：

import Numeric.LinearAlgebra

-- load into "l" matrices of size "n x m"
toMatrices :: Int -> Int -> Int -> [C] -> [Matrix C]
toMatrices l n m dats = ...

请注意，您在n x m上面编写了矩阵，i与n和相关j联m。n翻转and的角色会更常见m，但我坚持使用你的符号，所以请注意这一点。

如果整个数据列表[C]可以舒适地放入内存中，您可以通过编写如下内容来不变地执行此操作：

import Data.List
import Data.List.Split
import Numeric.LinearAlgebra

toMatrices :: Int -> Int -> Int -> [C] -> [Matrix C]
toMatrices l n m = map (reshape m . fromList) . transpose . chunksOf l

这会将输入数据分解为- 大小的l块，将它们转置为l列表，并将每个列表转换为矩阵。如果有某种方法可以Matrix C并行强制所有值，则可以通过一次遍历数据来完成，而无需保留整个列表。不幸的是，单个Matrix C值只能被一个一个地强制，整个列表需要保留，直到它们都可以被强制。

因此，如果“巨大”[C]列表对于内存来说太大了，那么您可能是正确的，您需要将数据加载到（部分）可变结构中。代码编写起来有些挑战性，但它的最终形式并不算太糟糕。我相信以下方法会起作用：

import Data.List
import Numeric.LinearAlgebra
import qualified Data.Vector as V
import qualified Data.Vector.Storable as VS
import qualified Data.Vector.Storable.Mutable as VSM

-- Create an l-length list of n x m hmatrix Matrices
toMatrices :: Int -> Int -> Int -> [C] -> [Matrix C]
toMatrices l n m dats = map (reshape m) $ VS.createT $ do
  mats <- V.replicateM l $ VSM.unsafeNew (m*n)
  sequence_ $ zipWith (\(i,j,k) x ->
      VSM.unsafeWrite (mats V.! k) (loc i j) x) idxs (dats ++ repeat 0)
  return $ V.toList mats

  where idxs = (,,) <$> [0..n-1] <*> [0..m-1] <*> [0..l-1]
        loc i j = i*m + j

test1 = toMatrices 4 3 2 (fromIntegral <$> [1..24])
test2 = toMatrices 1000 1000 100 (fromIntegral <$> [1..])

main = do
  print $ test1
  print $ norm_Inf . foldl1' (+) $ test2

使用编译时-O2，最大驻留量约为 1.6Gigs，这与在内存中保存 100 个包含 100 万个 16 字节复数值的矩阵所需的内存相匹配，因此看起来是正确的。

无论如何，这个版本toMatrices由于使用了三种不同的向量变体而变得有些复杂。有Vectorfrom hmatrix，和不可变的可存储VS.Vectorfrom一样vector；然后还有另外两种类型vector：不可变的 boxedV.Vector和可变的 storable VSM.Vector。

-blockdo创建 a V.Vectorof VSM.Vectors 并使用跨索引/值对执行的一系列 monadic 操作填充那些。您可以通过修改的定义idxs以匹配数据流的顺序，以任何顺序加载数据。-blockdo返回VSM.Vector列表中的最后一个 s，辅助函数VS.createT将它们全部冻结为VS.Vectors（即Vectorfrom hmatrix），并reshape跨向量映射以将它们转换为m-column 矩阵。

请注意，您必须注意，在实际应用程序中，从文件中读取的数据项列表不会被除之外的代码保留toMatrices，无论是原始文本形式还是解析后的数字形式。这应该不会太难做到正确，但是您可能希望在将计算机锁定在真实数据集上之前对中等大小的测试输入进行测试。

haskell - Haskell中可变非整数类型的可变列表

1 回答 1

Related

Reference