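In my project I have to copy a large amount of numerical data from a CUDA (GPU) device, that is, from the video card's memory, into a std::valarray (or std::vector).
So I need to resize these containers as fast as possible, but when I call the member function vector::resize, it initializes every element of the array to its default value with a loop.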
// In a very simplified description, resize behaves like this pseudocode:
vector<T>::resize(N){
   // Setup the new size
   // allocate the new array
   this->_internal_vector = new T[N];
   // init to default
   // This loop is slow !!!!
   for ( i = 0; i < N ; ++i){
      this->_internal_vector[i] = T();
   }
}
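Obviously I don't need this initialization, because I have to copy the data from the GPU and all the old data gets overwritten anyway. The initialization takes time, so I lose performance.
To work with the data I need the memory to be allocated, and that is what resize() does.
My very dirty and wrong workaround is to use vector::reserve(), but then I lose all the features of vector (size() stays at 0), and if I resize afterwards, the data is replaced with default values.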
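To make the problem with the reserve() workaround concrete, here is a small sketch. The helper names `reserve_only` and `value_after_resize` and the struct `ReserveReport` are made up for illustration; only the std::vector calls are real.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: what the vector reports after reserve().
struct ReserveReport {
    std::size_t size;
    std::size_t capacity;
    bool empty;
};

// reserve() allocates storage but constructs no elements,
// so the vector still considers itself empty.
ReserveReport reserve_only(std::size_t n) {
    std::vector<double> v;
    v.reserve(n);
    return {v.size(), v.capacity(), v.empty()};
}

// A later resize() value-initializes every new element, so anything
// written into the reserved-but-unconstructed storage would be lost
// (and writing there is undefined behavior in the first place).
double value_after_resize(std::size_t n) {
    std::vector<double> v;
    v.reserve(n);
    v.resize(n);   // elements become 0.0 for double
    return v[0];
}
```

With these helpers, `reserve_only(100)` reports a size of 0 and a capacity of at least 100, and `value_after_resize(10)` returns 0.0, which is exactly the "replaced with default values" effect described above.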
So, do you know of a strategy to avoid this pre-initialization to default values (in a valarray or a vector)?
I want a resize method that behaves like this:
vector<T>::resize(N) {
    // Allocate the memory.
    this->_internal_vector = new T[N];
    // Update the size of the vector or valarray
    // !! DO NOT initialize the new values.
}
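For comparison, the wished-for behavior can be approximated outside the standard containers. The following is only a sketch under assumptions: `UninitBuffer` is a hypothetical class, not part of any library, and it relies on the fact that `new T[n]` without parentheses default-initializes, which leaves elements of a trivial type such as double uninitialized. Like the pseudocode above, its resize() discards the old contents.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical minimal buffer whose resize() allocates without
// initializing. Restricted to trivially default-constructible types,
// where skipping initialization is meaningful.
template <typename T>
class UninitBuffer {
    static_assert(std::is_trivially_default_constructible<T>::value,
                  "UninitBuffer only makes sense for trivial element types");

public:
    // Allocate n elements; old contents are discarded, new contents
    // are indeterminate until written (for example by a device copy).
    void resize(std::size_t n) {
        data_.reset(new T[n]);  // default-initialization: no zeroing loop
        size_ = n;
    }

    std::size_t size() const { return size_; }
    T* data() { return data_.get(); }
    const T* data() const { return data_.get(); }
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;
};
```

After `buf.resize(N)`, the pointer returned by `buf.data()` could serve as the destination of the device-to-host copy. The trade-off is that none of the std::vector or std::valarray interface (iterators, arithmetic, push_back) comes with it, and reading an element before it has been written is not allowed.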
Performance example:
#include <ctime>
#include <iostream>
#include <valarray>
#include <vector>
int main() {
  std::vector<double> vec;
  std::valarray<double> vec2;
  double *vec_raw;
  unsigned int N = 100000000;
  std::clock_t start;
  double duration;
  start = std::clock();
  // Dirty solution! reserve() allocates, but size() stays 0.
  vec.reserve(N);
  duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
  std::cout << "duration reserve: " << duration << std::endl;
  start = std::clock();
  vec_raw = new double[N];
  duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
  std::cout << "duration new: " << duration << std::endl;
  start = std::clock();
  for (unsigned int i = 0; i < N; ++i) {
    vec_raw[i] = 0;
  }
  duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
  std::cout << "duration raw init: " << duration << std::endl;
  start = std::clock();
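  // Dirty solution: size() is still 0 after reserve(), so writing through
  // operator[] here is undefined behavior; it is kept only for the timing.
  for (unsigned int i = 0; i < vec.capacity(); ++i) {
    vec[i] = 0;
  }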
  duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
  std::cout << "duration vec init dirty: " << duration << std::endl;
  start = std::clock();
  vec2.resize(N);
  duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
  std::cout << "duration valarray resize: " << duration << std::endl;
  delete[] vec_raw;  // release the raw array allocated above
  return 0;
}
Output:
duration reserve: 1.1e-05
duration new: 1e-05
duration raw init: 0.222263
duration vec init dirty: 0.214459
duration valarray resize: 0.215735
Note: replacing std::allocator does not seem to work, because the initialization loop is invoked by resize() itself.
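One caveat on the note above, offered as a sketch rather than a definitive answer: since C++11, std::vector constructs new elements in resize() through std::allocator_traits<A>::construct, so an allocator adaptor that overrides the zero-argument construct() can turn value-initialization into default-initialization. The name `default_init_allocator` below is an assumption for illustration (it follows a widely circulated idiom and is not a standard library type). This does not help std::valarray, which has no allocator parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor: behaves like A, except that
// construct(p) with no arguments default-initializes instead of
// value-initializing. For double this means "leave it uninitialized".
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // reuse the base allocator's constructors

    // Called by resize(n): no arguments, so default-initialize.
    template <typename U>
    void construct(U* p) noexcept(
        std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construction is forwarded to the base allocator unchanged.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p,
                          std::forward<Args>(args)...);
    }
};

// Convenience alias (also hypothetical).
using uninit_double_vector =
    std::vector<double, default_init_allocator<double>>;
```

With this alias, `uninit_double_vector v; v.resize(N);` sets size() to N without writing to the elements, while `v.resize(N, 0.0)` or `push_back` still behave as usual. Whether this actually removes the cost seen in the timings above would need to be measured on the target compiler and standard library.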