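In my project I have to copy a large amount of numeric data from a CUDA (GPU) device, that is, from the video card's memory, into a std::valarray (or std::vector).
So I need to resize these containers as fast as possible, but when I call the member function vector::resize, it initializes every element of the array to its default value in a loop.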
// In a super simplified description, resize behaves like this pseudocode:
vector<T>::resize(N) {
    // Set up the new size
    // Allocate the new array
    this->_internal_vector = new T[N];
    // Initialize every element to its default value
    // This loop is slow!
    for (i = 0; i < N; ++i) {
        this->_internal_vector[i] = T();
    }
}
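Obviously I do not need this initialization, because I am about to copy the data from the GPU and every old value gets overwritten. The initialization takes time, so I lose performance.
To work with the data I need the memory to be allocated, and that is what resize() does.
My very dirty and wrong workaround is to use vector::reserve(), but then I lose all the features of vector, and if I later resize the vector, the data is replaced with default values.
So, does anyone know a strategy to avoid this pre-initialization to default values (in a valarray or a vector)?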
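To make the problem with the reserve() workaround concrete, here is a small sketch. The struct ReserveReport and the function reserve_then_resize are hypothetical names used only for illustration. It records that reserve() leaves size() at 0, and that a later resize() value-initializes every element to 0.0, which would wipe anything written into the reserved storage beforehand.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration: what reserve() and a later
// resize() do to a vector's size and contents.
struct ReserveReport {
    std::size_t size_after_reserve;
    bool capacity_sufficient;
    std::size_t size_after_resize;
    bool all_zero_after_resize;
};

inline ReserveReport reserve_then_resize(std::size_t n) {
    ReserveReport report{};
    std::vector<double> v;

    v.reserve(n);                           // allocates storage only
    report.size_after_reserve = v.size();   // still 0: no elements exist yet
    report.capacity_sufficient = v.capacity() >= n;

    v.resize(n);                            // value-initializes n doubles to 0.0
    report.size_after_resize = v.size();
    report.all_zero_after_resize = true;
    for (double x : v) {
        if (x != 0.0) report.all_zero_after_resize = false;
    }
    return report;
}
```

Because size() stays 0 after reserve(), range-based loops, size(), and iterators all see an empty vector, which is what "losing the features of vector" amounts to here.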
I want a resize method that behaves like this:
vector<T>::resize(N) {
    // Allocate the memory.
    this->_internal_vector = new T[N];
    // Update the size of the vector or valarray
    // !! DO NOT initialize the new values.
}
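Outside of std::vector, the behavior described in this pseudocode can be sketched with a tiny buffer class. RawBuffer is a hypothetical name, not a standard type and not a drop-in replacement for std::vector, and the sketch assumes a trivially constructible element type such as double. Like the pseudocode, its resize() discards the old contents.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical minimal buffer: resize() allocates storage but leaves the
// elements uninitialized. Intended only for trivial types such as double.
template <typename T>
class RawBuffer {
public:
    void resize(std::size_t n) {
        // new T[n] default-initializes: for double, no values are written.
        // (std::make_unique<T[]>(n) would value-initialize, i.e. zero-fill.)
        data_.reset(new T[n]);
        size_ = n;
    }

    std::size_t size() const { return size_; }
    T* data() { return data_.get(); }
    const T* data() const { return data_.get(); }
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::unique_ptr<T[]> data_;   // owns the array, frees it automatically
    std::size_t size_ = 0;
};
```

The pointer returned by data() could then be passed as the destination of the device-to-host copy, for example cudaMemcpy(buf.data(), device_ptr, n * sizeof(double), cudaMemcpyDeviceToHost), where device_ptr stands for whatever device pointer the project already has. Elements must be written before they are read.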
Performance example:
#include <ctime>     // std::clock, std::clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <valarray>
#include <vector>

int main() {
    std::vector<double> vec;
    std::valarray<double> vec2;
    double *vec_raw;
    unsigned int N = 100000000;
    std::clock_t start;
    double duration;

    // reserve(): allocates storage but constructs no elements
    start = std::clock();
    // Dirty solution!
    vec.reserve(N);
    duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << "duration reserve: " << duration << std::endl;

    // new double[N]: allocates without writing any values
    start = std::clock();
    vec_raw = new double[N];
    duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << "duration new: " << duration << std::endl;

    // Explicit initialization loop over the raw array
    start = std::clock();
    for (unsigned int i = 0; i < N; ++i) {
        vec_raw[i] = 0;
    }
    duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << "duration raw init: " << duration << std::endl;

    start = std::clock();
    // Dirty solution: vec.size() is still 0 here, so indexing the
    // reserved storage is undefined behavior. Shown for timing only.
    for (unsigned int i = 0; i < vec.capacity(); ++i) {
        vec[i] = 0;
    }
    duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << "duration vec init dirty: " << duration << std::endl;

    // valarray::resize(): allocates and initializes every element
    start = std::clock();
    vec2.resize(N);
    duration = (std::clock() - start) / (double)CLOCKS_PER_SEC;
    std::cout << "duration valarray resize: " << duration << std::endl;

    delete[] vec_raw;
    return 0;
}
Output:
duration reserve: 1.1e-05
duration new: 1e-05
duration raw init: 0.222263
duration vec init dirty: 0.214459
duration valarray resize: 0.215735
Note: replacing std::allocator does not work, because the loop is invoked by resize().
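One caveat to this note, offered as a sketch rather than a definitive answer: since C++11, vector::resize(n) constructs each new element through std::allocator_traits<A>::construct. An allocator that only changes how memory is obtained therefore has no effect on the loop, but an allocator whose construct() default-initializes instead of value-initializing may make the loop write nothing for types like double. The names DefaultInitAllocator and NoInitVector below are hypothetical, and this does not apply to std::valarray, which takes no allocator.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical allocator adaptor: behaves like A, except that elements
// constructed with no arguments are default-initialized rather than
// value-initialized.
template <typename T, typename A = std::allocator<T>>
class DefaultInitAllocator : public A {
    using Traits = std::allocator_traits<A>;

public:
    template <typename U>
    struct rebind {
        using other =
            DefaultInitAllocator<U, typename Traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the base allocator's constructors

    // Used by resize(n) for each new element: `new U` with no initializer
    // leaves a double uninitialized instead of storing 0.0.
    template <typename U>
    void construct(U* p) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construction (e.g. resize(n, value), push_back) is
    // forwarded to the base allocator unchanged.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        Traits::construct(static_cast<A&>(*this), p,
                          std::forward<Args>(args)...);
    }
};

// Hypothetical convenience alias.
template <typename T>
using NoInitVector = std::vector<T, DefaultInitAllocator<T>>;
```

With this, NoInitVector<double> v; v.resize(N); leaves the new elements with indeterminate values, so they must be written (for instance by the copy from the GPU) before being read. The trade-off is that NoInitVector<double> is a different type from std::vector<double>, so it cannot be passed where the latter is expected.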