I want to use a vector with the custom allocator below, in which construct() and destroy() have an empty body:

struct MyAllocator : public std::allocator<char> {
    typedef allocator<char> Alloc;
    //void destroy(Alloc::pointer p) {} // pre-c++11
    //void construct(Alloc::pointer p, Alloc::const_reference val) {} // pre-c++11
    template< class U > void destroy(U* p) {}
    template< class U, class... Args > void construct(U* p, Args&&... args) {}
    template<typename U> struct rebind {typedef MyAllocator other;};
};

Now for the reasons I have specified in another question, the vector has to be resized several times in a loop. To simplify my tests on performance, I made a very simple loop like the following:

std::vector<char, MyAllocator> v;
v.reserve(1000000); // or more. Make sure there is always enough allocated memory
while (true) {
   v.resize(1000000);
   // sleep for 10 ms
   v.clear(); // or v.resize(0);
};

I noticed that changing the size this way raises CPU consumption from 30% to 80%, even though the allocator has empty construct() and destroy() member functions. With optimization enabled, I would have expected little or no impact on performance. How is that increase possible? A second question: why, when I read the memory after any resize, is the value of each char in the resized memory 0? (I would expect some non-zero values, since construct() does nothing.)

My environment is g++ 4.7.0 with -O3 optimization enabled, on a dual-core Intel PC with 4 GB of free memory. Apparently the calls to construct() could not be optimized out at all?

2 Answers

Update

This is a complete rewrite. There was an error in the original post/my answer that made me benchmark the same allocator twice. Oops.

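OK, I can see a huge difference in performance. I made the following test bed, which takes a few precautions to make sure the crucial parts don't get optimized away entirely. I then verified (using -O0 -fno-inline) that the allocator's construct/destroy calls are invoked the expected number of times (they are):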

#include <cstdlib>
#include <memory>
#include <vector>

template<typename T>
struct MyAllocator : public std::allocator<T> {
    typedef std::allocator<T> Alloc;
    //void destroy(Alloc::pointer p) {} // pre-c++11
    //void construct(Alloc::pointer p, Alloc::const_reference val) {} // pre-c++11
    template< class U > void destroy(U* p) {}
    template< class U, class... Args > void construct(U* p, Args&&... args) {}
    template<typename U> struct rebind {typedef MyAllocator other;};
};

int main()
{
    typedef char T;
#ifdef OWN_ALLOCATOR
    std::vector<T, MyAllocator<T> > v;
#else
    std::vector<T> v;
#endif
    volatile unsigned long long x = 0;
    v.reserve(1000000); // or more. Make sure there is always enough allocated memory
    for(auto i=0ul; i< 1<<18; i++) {
        v.resize(1000000);
        x += v[rand()%v.size()];//._x;
        v.clear(); // or v.resize(0);
    };
}

The timing difference is marked:

g++ -g -O3 -std=c++0x -I ~/custom/boost/ test.cpp -o test 

real    0m9.300s
user    0m9.289s
sys 0m0.000s

g++ -g -O3 -std=c++0x -DOWN_ALLOCATOR -I ~/custom/boost/ test.cpp -o test 

real    0m0.004s
user    0m0.000s
sys 0m0.000s

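I can only assume that what you are seeing is related to the standard library optimizing allocator operations for char (it being a POD type).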

The timings get even further apart when you use

struct NonTrivial
{
    NonTrivial() { _x = 42; }
    virtual ~NonTrivial() {}
    char _x;
};

typedef NonTrivial T;

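In this case, the default allocator takes over 2 minutes (still running), whereas the 'dummy' MyAllocator takes ~0.006s. (Note that this invokes undefined behaviour, because it references elements that haven't been properly initialized.)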
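One way to skip the zero-fill without the undefined behaviour noted above is to have construct() default-initialise instead of doing nothing. The following is only a sketch under assumptions: the names DefaultInitAllocator, Sets42, first_after_resize and last_after_fill are made up for illustration, and it relies on a C++11 library that routes resize() through std::allocator_traits.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical allocator: default-initialises when construct() gets no arguments.
template <typename T>
struct DefaultInitAllocator {
    using value_type = T;

    DefaultInitAllocator() = default;
    template <typename U>
    DefaultInitAllocator(const DefaultInitAllocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // resize(n) path: no arguments, so default-initialise. For char this
    // writes nothing; for class types the default constructor still runs.
    template <typename U>
    void construct(U* p) { ::new (static_cast<void*>(p)) U; }

    // Every other path (copies, moves, resize(n, value)) behaves as usual.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <typename T, typename U>
bool operator==(const DefaultInitAllocator<T>&, const DefaultInitAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const DefaultInitAllocator<T>&, const DefaultInitAllocator<U>&) { return false; }

// Hypothetical type, similar to NonTrivial but without the virtual destructor.
struct Sets42 {
    Sets42() : x(42) {}
    char x;
};

// Returns the member of the first element after a plain resize().
char first_after_resize() {
    std::vector<Sets42, DefaultInitAllocator<Sets42>> v;
    v.resize(3);  // the constructor runs for each element
    return v[0].x;
}

// Returns the last element after resize(n, value).
char last_after_fill(char value) {
    std::vector<char, DefaultInitAllocator<char>> v;
    v.resize(4, value);  // goes through the variadic construct()
    return v.back();
}
```

With this approach a char element is still indeterminate until it is written, so it must be assigned before it is read; class types keep their normal construction and destruction.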

answered 2013-03-06T02:13:34.037
(With corrections thanks to GManNickG and Jonathan Wakely below.)

In C++11, with the post-Standard correction proposed by http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3346.pdf, resize() will construct the added elements using the custom allocator.

In earlier versions, resize() value-initialises the added elements, which takes time.

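These initialisation steps have nothing to do with memory allocation; they are what is done to the memory after it has been allocated. In those versions, value initialisation is an unavoidable expense.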

Given the state of C++11 Standard compliance in current compilers, it is worth looking at your headers to see which approach is being used.
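Besides reading the headers, the behaviour can be probed at run time by counting how often the library calls into the allocator. A minimal sketch, assuming a C++11 compiler; CountingAllocator, CallCounts, count_resize_clear and the two global counters are hypothetical names introduced here:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical counters, reset inside count_resize_clear().
static std::size_t g_constructs = 0;
static std::size_t g_destroys = 0;

template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }

    // Count the call, then construct normally so no element is left uninitialised.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ++g_constructs;
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }

    // Count the call, then destroy normally.
    template <typename U>
    void destroy(U* p) {
        ++g_destroys;
        p->~U();
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

struct CallCounts {
    std::size_t constructs;
    std::size_t destroys;
};

// Runs one resize(n)/clear() cycle on pre-reserved storage and reports
// how many times the library called construct() and destroy().
CallCounts count_resize_clear(std::size_t n) {
    std::vector<char, CountingAllocator<char>> v;
    v.reserve(n);
    assert(v.capacity() >= n);  // no reallocation during the measured cycle
    g_constructs = 0;
    g_destroys = 0;
    v.resize(n);
    v.clear();
    return CallCounts{g_constructs, g_destroys};
}
```

If count_resize_clear(n) reports n for both counters, the library is going through the allocator for every element; counts of zero would suggest it is initialising the elements by some other route.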

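Value initialisation is sometimes unnecessary and inconvenient, but it also protects a lot of programs from unintended mistakes. For example, someone might think they can resize a std::vector<std::string> to have 100 "uninitialised" strings and then start assigning to them before reading from them. But a precondition of the assignment operator is that the object being changed has been properly constructed; otherwise it will likely find a garbage pointer and try to delete[] it. Only a careful placement new of each element can construct them safely. The API design errs on the side of robustness.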
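To illustrate the point about placement new, here is a small sketch of constructing strings in raw storage before assigning to them. The function name build_then_assign and the "s0", "s1", ... contents are made up for the example; it uses std::allocator only to obtain and release the raw memory.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Builds n strings in raw storage, assigns to the first one, and returns
// all of them joined with trailing commas.
std::string build_then_assign(std::size_t n) {
    if (n == 0) return std::string();

    using Traits = std::allocator_traits<std::allocator<std::string>>;
    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(n);  // storage only, no strings exist yet
    std::size_t built = 0;                 // number of constructed elements
    std::string joined;

    try {
        // Placement new turns each slot into a real std::string.
        for (; built < n; ++built) {
            ::new (static_cast<void*>(raw + built))
                std::string("s" + std::to_string(built));
        }

        // Only now does operator= have a valid object to work on.
        raw[0] = "first";

        for (std::size_t i = 0; i < n; ++i) {
            joined += raw[i];
            joined += ',';
        }
    } catch (...) {
        // Destroy whatever was constructed, release the storage, rethrow.
        while (built > 0) Traits::destroy(alloc, raw + --built);
        alloc.deallocate(raw, n);
        throw;
    }

    while (built > 0) Traits::destroy(alloc, raw + --built);
    alloc.deallocate(raw, n);
    return joined;
}
```

Assigning to raw[0] before the placement-new loop would run operator= on bytes that are not a std::string, which is exactly the mistake described above.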

answered 2013-03-06T01:48:59.003