c++ - 设计需要暂存空间的算法

Question

C++ 标准库将数据结构与算法分开，例如std::sort：

template< class RandomAccessIterator >
void sort( RandomAccessIterator first, RandomAccessIterator last );

当算法需要中间暂存空间时，我想保持算法和数据结构的分离。

考虑到这个目标，我想实现一种图像算法，该算法需要输入和输出图像之间的中间暂存空间。可以在函数调用中分配必要的暂存空间，但是由于这些调用的大小和频率相同，图像大小相同，这会严重降低性能。这使得将数据结构与算法分离变得更加困难。

实现此目的的一种可能方法如下：

// Algorithm function
template<typename InputImageView, typename OutputImageView, typename ScratchView>
void algorithm(
  InputImageView inputImageView, 
  OutputImageView outputImageView, 
  ScratchView scratchView
);

// Algorithm class with scratch space
template<typename DataStructure>
class Algorithm {
public:
  template<typename InputImageView,typename OutputImageView>
  void operator()(
  InputImageView inputImageView, 
  OutputImageView outputImageView
  ){
    m_scratch.resize(inputImageView.size());
    algorithm(inputImageView,outputImageView,makeView(m_scratch));
  }

private:
  DataStructure m_scratch;
}

以上是一种有效的算法+暂存空间设计，还是有更好的方法？

旁注：我正在使用boost::gil库

score 3 · Accepted Answer

我认为在这种情况下，我会让算法允许您传递（引用或指向）暂存空间的结构，并为该参数提供默认值。这样，当/如果分配结构的额外时间不是问题时，用户可以在不传递结构的情况下调用函数，但是如果（例如）构建可以从重复重用相同空间中受益的处理管道，则可以传递一个.

score 2 · Accepted Answer

如果你使用一个函数对象，你可以携带任何你需要的状态。

两个有用的算法是transform和accumulate。

transform可以带一个函数对象对序列中的每个对象进行转换：

class TransformerWithScratchSpace {
public:
    Target operator()(const Source& source);
};

vector<Source> sources;
vector<Target> targets;
targets.reserve(sources.size());

transform(sources.begin(), sources.end(),
          back_inserter<Target>(targets),
          TransformerWithScratchSpace());

accumulate可以采用一个函数对象，它将所有对象累积到自身中。结果是累积的对象。蓄能器本身不必产生任何东西。

class Accumulator {
public:
    Accumulator& operator+()(const Source& source);
};

vector<Source> sources;

Accumulator accumulated = accumulate(sources.begin(), sources.end(),
                                     Accumulator());

score 1 · Accepted Answer

您最初的问题使用的设计resize()效率不高，因为调整大小可能不仅需要分配，而且还会将现有内容从旧分配复制到新分配。它还需要在释放旧空间之前分配和填充新空间，从而增加最大峰值内存使用量。

最好为客户端代码提供某种方法来计算必须提供多大的结构作为暂存空间，然后断言传递的暂存空间满足库例程在入口处的需要。计算可以是算法类的另一种方法，或者暂存空间对象的分配/工厂可以采用适当的代表性参数（正确的大小/形状，或大小本身）并返回合适且可重用的暂存空间对象。

一旦被要求使用，工作者算法不应该以任何方式“操纵”暂存空间以使其适合，因为这种操纵往往很昂贵。

score 1 · Accepted Answer

正如你所提到的，这个问题真的可以被认为远远超出了暂存空间中的图像。我实际上以许多不同的形式遇到过这种情况（内存部分、类型化数组、线程、网络连接......）。

所以我最终做的是给自己写一个通用的“ BufferPool”。它是一个管理任何形式的缓冲区对象的类，无论是字节数组、其他一些内存，还是（在您的情况下）分配的图像。我从ThreadPool.

这是一个相当简单的类，它维护一个对象池，当您需要一个Buffer对象时，您可以从中缓冲，当您完成它时，它会返回到池中。该函数将检查池中是否有可用的缓冲区，如果没有，将创建一个新的。如果池中有一个，它将清除它，即清除它，以便其行为与新创建的相同。acquirereleaseacquireresetBuffer

然后我有几个 this 的静态实例，一个用于我使用BufferPool的每种不同类型：一个用于数组，一个用于数组，......（我正在用 Java 编码，以防你想知道...... :) 我然后在我正在编写的所有库函数中使用这些静态实例。例如，这允许我的加密函数可以与我的二进制展平函数或我的应用程序中的任何其他代码共享字节数组。通过这种方式，我可以最大限度地重用这些对象，并且在许多情况下它给了我很大的性能提升。Bufferbytechar

在 C++ 中，您可以通过基于这种池技术为所需的数据结构编写自定义分配器来非常优雅地实现这种无处不在的方案（感谢 Andrew 指出这一点；请参阅评论）。

我为我的字节数组缓冲区做的一件事是该acquire函数将接受一个minimumLength参数，该参数指定我需要的缓冲区的最小大小。然后，它只会从池中返回一个至少具有此长度的字节数组，或者如果池为空或其中只有较小的图像，则创建一个新的。您可以对图像缓冲区使用相同的方法。让acquire函数接受一个minWidth和minHeight参数，然后从池中返回至少具有这些尺寸的图像，或者创建具有这些尺寸的图像。然后，您可以让该reset功能仅清除图像的 (0, 0) 到 ( minWidth, minHeight) 部分，如果您甚至需要清除它。

我决定在我的代码中不担心的一个功能，但您可能需要考虑取决于您的应用程序将运行多长时间以及它将处理多少不同的图像大小是您是否想以某种方式限制缓冲区大小并释放缓存的图像以减少应用程序的内存使用。

举个例子，这是我用于我的代码ByteArrayPool：

public class ByteArrayPool {

    private static final Map<Integer, Stack<byte[]>> POOL = new HashMap<Integer, Stack<byte[]>>();

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after clearing
     * it with 0's, if one is available. Otherwise returns a fresh <code>byte[]</code>
     * of the given length.
     * 
     * @param length the length of the <code>byte[]</code>
     * @return a fresh or zero'd buffer object
     */
    public static byte[] acquire(int length) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of lenth "+length);
            return new byte[length];
        }
        if (stack.empty()) return new byte[length];
        byte[] result = stack.pop();
        Arrays.fill(result, (byte) 0);
        return result;
    }

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after optionally clearing
     * it with 0's, if one is available. Otherwise returns a fresh <code>byte[]</code>
     * of the given length.<br/>
     * <br/>
     * If the initialized state of the needed <code>byte[]</code> is irrelevant, calling this
     * method with <code>zero</code> set to <code>false</code> leads to the best performance.
     * 
     * @param length the length of the <code>byte[]</code>
     * @param zero T - initialize a reused array to 0
     * @return a fresh or optionally zero'd buffer object
     */
    public static byte[] acquire(int length, boolean zero) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of lenth "+length);
            return new byte[length];
        }
        if (stack.empty()) return new byte[length];
        byte[] result = stack.pop();
        if (zero) Arrays.fill(result, (byte) 0);
        return result;
    }

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after setting all
     * of its entries to the given <code>initializationValue</code>, if one is available.
     * Otherwise returns a fresh <code>byte[]</code> of the given length, which is also
     * initialized to the given <code>initializationValue</code>.<br/>
     * <br/>
     * For performance reasons, do not use this method with <code>initializationValue</code>
     * set to <code>0</code>. Use <code>acquire(<i>length</i>)</code> instead.
     * 
     * @param length the length of the <code>byte[]</code>
     * @param initializationValue the
     * @return a fresh or zero'd buffer object
     */
    public static byte[] acquire(int length, byte initializationValue) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of lenth "+length);
            byte[] result = new byte[length];
            Arrays.fill(result, initializationValue);
            return result;
        }
        if (stack.empty()) {
            byte[] result = new byte[length];
            Arrays.fill(result, initializationValue);
            return result;
        }
        byte[] result = stack.pop();
        Arrays.fill(result, initializationValue);
        return result;
    }

    /**
     * Puts the given <code>byte[]</code> back into the <code>ByteArrayPool</code>
     * for future reuse.
     * 
     * @param buffer the <code>byte[]</code> to return to the pool
     */
    public static byte[] release(byte[] buffer) {
        Stack<byte[]> stack = POOL.get(buffer.length);
        if (stack==null) {
            stack = new Stack<byte[]>();
            POOL.put(buffer.length, stack);
        }
        stack.push(buffer);
        return buffer;
    }
}

然后，在我需要的所有代码的其余部分中byte[]，我使用类似的东西：

byte[] buffer = ByteArrayPool.acquire(65536, false);
try {
    // Do something requiring a byte[] of length 65536 or longer
} finally {
    ByteArrayPool.release(buffer);
}

请注意，我是如何添加 3 个不同的acquire函数的，这些函数允许我指定我需要的缓冲区的“干净”程度。例如，如果我无论如何都要覆盖所有它，则无需浪费时间先将其归零。

c++ - 设计需要暂存空间的算法

4 回答 4

Related

Reference