c++ - C++ vector that doesn't initialize its members?

Question

I'm writing a C++ wrapper for a piece of C code that returns a large array, so I've tried to return the data in a vector<unsigned char>.

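The problem is that the data is on the order of megabytes, and vector needlessly initializes its storage, which ends up cutting my speed roughly in half.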

How do I prevent this?

Or, if that's not possible, is there some other STL container that would avoid this needless work? Or must I end up writing my own container?

(Pre-C++11)

Note:

I'm passing the vector as my output buffer. I'm not copying the data from elsewhere.
It's something like this:

vector<unsigned char> buf(size);   // Why initialize??
GetMyDataFromC(&buf[0], buf.size());

score 57 · Accepted Answer

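For both default- and value-initialization of a struct with a user-provided default constructor that doesn't explicitly initialize anything, no initialization is performed on unsigned char members: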

struct uninitialized_char {
    unsigned char m;
    uninitialized_char() {}
};

// just to be safe
static_assert(1 == sizeof(uninitialized_char), "");

std::vector<uninitialized_char> v(4 * (1<<20));

GetMyDataFromC(reinterpret_cast<unsigned char*>(&v[0]), v.size());

I think this is even legal under the strict aliasing rules.

When I compared the construction time of v against that of a vector<unsigned char>, I got ~8 µs vs. ~12 ms. That's more than 1000x faster. The compiler was clang 3.2 with libc++ and these flags: -std=c++11 -Os -fcatch-undefined-behavior -ftrapv -pedantic -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-missing-prototypes

C++11 has a helper for uninitialized storage, std::aligned_storage, though it requires a compile-time size.
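As a rough sketch of what that looks like (assuming C++11 and a small fixed size; `kBufSize`, `raw_buffer`, `stub_fill`, and `read_last_byte` are hypothetical names, and `stub_fill` only stands in for the real C function), the size has to be a template argument. Note that std::aligned_storage was deprecated in C++23.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the C routine that fills the output buffer.
inline void stub_fill(unsigned char* out, std::size_t n) {
    std::memset(out, 0x5A, n);
}

// The size is a template argument, so it must be a constant expression.
const std::size_t kBufSize = 4096;
typedef std::aligned_storage<kBufSize, alignof(unsigned char)>::type raw_buffer;

// Fills a stack buffer without initializing it first, then returns one
// byte so the caller can check that the data arrived.
inline unsigned char read_last_byte() {
    raw_buffer storage;  // default-initialized: no zero-fill happens here
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&storage);
    stub_fill(bytes, kBufSize);
    return bytes[kBufSize - 1];
}
```

Because the size is fixed at compile time, this only suits buffers with a known upper bound, not the runtime-sized case in the question.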

Here's an added example that compares total usage (times in nanoseconds):

VERSION=1 (vector<unsigned char>):

clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=1 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out

initialization+first use: 16,425,554
array initialization: 12,228,039
first use: 4,197,515
second use: 4,404,043

VERSION=2 (vector<uninitialized_char>):

clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=2 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out

initialization+first use: 7,523,216
array initialization: 12,782
first use: 7,510,434
second use: 4,155,241

#include <iostream>
#include <chrono>
#include <locale>   // std::locale, used by std::cout.imbue below
#include <vector>

struct uninitialized_char {
  unsigned char c;
  uninitialized_char() {}
};

void foo(unsigned char *c, int size) {
  for (int i = 0; i < size; ++i) {
    c[i] = '\0';
  }
}

int main() {
  auto start = std::chrono::steady_clock::now();

#if VERSION==1
  using element_type = unsigned char;
#elif VERSION==2
  using element_type = uninitialized_char;
#endif

  std::vector<element_type> v(4 * (1<<20));

  auto end = std::chrono::steady_clock::now();

  foo(reinterpret_cast<unsigned char*>(v.data()), v.size());

  auto end2 = std::chrono::steady_clock::now();

  foo(reinterpret_cast<unsigned char*>(v.data()), v.size());

  auto end3 = std::chrono::steady_clock::now();

  std::cout.imbue(std::locale(""));
  std::cout << "initialization+first use: " << std::chrono::nanoseconds(end2-start).count() << '\n';
  std::cout << "array initialization: " << std::chrono::nanoseconds(end-start).count() << '\n';
  std::cout << "first use: " << std::chrono::nanoseconds(end2-end).count() << '\n';
  std::cout << "second use: " << std::chrono::nanoseconds(end3-end2).count() << '\n';
}

I'm using clang svn-3.6.0 r218006

score 9 · Accepted Answer

Sorry, there's no way to avoid it.

C++11 adds a constructor that takes only a size, but even that value-initializes the data.

Your best bet is to allocate an array on the heap, stick it in a unique_ptr (where available), and use it from there.
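A minimal sketch of that approach, assuming C++11 is available; `fetch_data` is a hypothetical wrapper name, and `c_fill_stub` only stands in for the real C function:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the C function from the question.
inline void c_fill_stub(unsigned char* out, std::size_t n) {
    std::memset(out, 0x7F, n);
}

// Allocates n bytes without zeroing them, lets the C code fill them,
// and hands ownership back to the caller.
inline std::unique_ptr<unsigned char[]> fetch_data(std::size_t n) {
    // `new unsigned char[n]` default-initializes, which leaves the bytes
    // untouched; `new unsigned char[n]()` would value-initialize (zero) them.
    std::unique_ptr<unsigned char[]> buf(new unsigned char[n]);
    c_fill_stub(buf.get(), n);
    return buf;
}
```

The array form of unique_ptr calls delete[] on destruction, but it does not remember the length, so the caller has to carry the size separately.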

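If you're willing to, as you say, "hack into the STL", you could always grab a copy of EASTL to work from. It's a variation of some of the STL containers that allows for more restricted memory conditions. A proper implementation of what you're trying to do would give the constructor a special value meaning "default-initialize the members", which for POD types means doing nothing to initialize the memory. Of course, this requires some template metaprogramming to detect whether the type is a POD type.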

score 4 · Accepted Answer

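The optimal solution is simply to change the allocator so that it does nothing for a zero-argument construct. The underlying element type stays the same, which dodges any kind of nasty reinterpret_cast-ing and potential aliasing violations, and it can non-intrusively un-initialize any type.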

template<typename T> struct no_initialize : std::allocator<T> {
    void construct(T* p) {}
    template<typename... Args> void construct(T* p, Args&&... args) {
        new (p) T(std::forward<Args>(args)...);
    }
};
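A fuller sketch of the same idea, assuming C++11 or later. The names `default_init_allocator` and `raw_byte_vector` are hypothetical. Two details are worth spelling out: before C++20, std::allocator has its own rebind member, so a class that inherits from it without redefining rebind can be rebound back to plain std::allocator<U> by the container; and using `new (p) U;` rather than an empty body still runs the default constructor for class types.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical name: an allocator whose zero-argument construct()
// default-initializes instead of value-initializing.
template <typename T>
struct default_init_allocator : std::allocator<T> {
    default_init_allocator() {}
    template <typename U>
    default_init_allocator(const default_init_allocator<U>&) {}

    // Keep rebinding inside this template. Before C++20 the inherited
    // std::allocator<T>::rebind would hand back std::allocator<U>.
    template <typename U>
    struct rebind { typedef default_init_allocator<U> other; };

    // Used by vector(n) and resize(n): no initializer, so trivial types
    // such as unsigned char are left untouched.
    template <typename U>
    void construct(U* p) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construction forwards its arguments as usual.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

typedef std::vector<unsigned char, default_init_allocator<unsigned char> >
    raw_byte_vector;
```

With this in place, `raw_byte_vector buf(size);` followed by `GetMyDataFromC(&buf[0], buf.size());` needs no cast. The trade-off is that the vector type differs from std::vector<unsigned char>, so it cannot be passed where that exact type is expected.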

score 3 · Accepted Answer

1. In your case, using std::vector seems neither necessary nor wise. You only want some object to manage some raw memory for you. That is easily achieved with

std::unique_ptr<void, void(*)(void*)> p(std::malloc(n), std::free);

2. If you really want to use std::vector<>, you can use the trick described here.

score -2 · Accepted Answer

How about using vector.reserve() to only allocate the storage without initializing it?
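One caveat with this suggestion, sketched below with hypothetical names (`sizes`, `after_reserve`): reserve() changes capacity() but not size(). No elements exist in the reserved region, so writing into it through &buf[0] is undefined behavior, and the vector would not know about any data the C function stored there.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type for reporting what reserve() did.
struct sizes {
    std::size_t size;
    std::size_t capacity;
};

// Reserves room for n bytes and reports the resulting size and capacity.
inline sizes after_reserve(std::size_t n) {
    std::vector<unsigned char> buf;
    buf.reserve(n);  // allocates storage, constructs no elements
    sizes s = { buf.size(), buf.capacity() };
    return s;
}
```

After `after_reserve(n)`, the reported size is 0 while the capacity is at least n, which is why the approach from the question still needs a sized vector (or a different owner for the raw memory).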

answered 2013-06-10T15:19:57.760
