c++ - What happens under the hood of vector::push_back memory wise?

Question

My question is about the effect of vector::push_back. I know it adds an element at the end of the vector, but what happens under the hood?

IIRC memory objects are allocated in a sequential manner, so my question is whether vector::push_back simply allocates more memory immediately after the vector, and if so what happens if there is not enough free memory in that location? Or perhaps a pointer is added in the "end" to cause the vector to "hop" to the location it continues? Or is it simply reallocated through copying it to another location that has enough space and the old copy gets discarded? Or maybe something else?

score 23 · Accepted Answer

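If enough space has already been allocated, the object is copy-constructed in place from the argument. When there is not enough memory, the vector grows its internal data buffer following some kind of geometric progression (each time the new size is k*old_size with k > 1^[1]), and all objects in the original buffer are then moved to the new buffer. After the operation completes, the old buffer is released to the system.

In the previous sentence, "moved" is not used in the technical move-constructor / move-assignment sense; the objects may be moved, copied, or transferred by any equivalent operation.

^[1] Growing by a factor k > 1 ensures that the amortized cost of push_back is constant. The actual constant varies from one implementation to another (Dinkumware uses 1.5, gcc uses 2). Amortized cost means that even though every so often a push_back is very expensive (O(N) in the size of the vector at that time), those cases are rare enough that the total cost of the whole sequence of insertions is linear in the number of insertions, so each insertion costs a constant amount on average.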

score 4 · Accepted Answer

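When the vector runs out of space, it uses its allocator to reserve more.

How that is implemented is up to the allocator.

However, the vector decides how much space to reserve. The capacity grows geometrically, commonly by a factor of at least 1.5 ¹ (see the comments). The standard does not fix the factor, but it requires push_back to run in amortized constant time, which prevents the horrible performance that repeated "small" allocations would cause.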

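Regarding the physical move/copy of the elements:

- A C++11-conforming implementation will move the elements if they support move construction and assignment (in practice, when the move constructor is noexcept or the type cannot be copied).
- Most implementations I know of (g++ notably) will simply use std::copy for POD types; the algorithm's specialization for POD types ensures that this compiles down to (essentially) a memcpy operation. That in turn is compiled to whatever CPU instructions are fastest on your system (e.g. SSE2 instructions).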

¹ I tried to find a quotation for this in the n3242 standard draft, but I could not locate one at this time.

score 2 · Accepted Answer

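A vector guarantees that all of its elements are contiguous in memory.

Internally you can think of it as being defined by three pointers (or things that act like pointers):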

start:     Points at the beginning of the allocated block.
final:     Points one past the last element in the vector.
           If the vector is empty then start == final 
capacity:  Points one past the end of allocated memory.
           If final == capacity there is no room left.

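When you push back:

If final is less than capacity:
- The new element is copied into the location that final points at.
- final is incremented to the next location.
If final is equal to capacity, the vector is full:
- New memory must be allocated.
- The vector (through its allocator) allocates X*(capacity - start)*sizeof(T) bytes,
- where X is usually a value between 1.5 and 2.
- It then copies all the values from the old memory buffer to the new one.
- The new value is added to the new buffer.
- The start/final/capacity pointers are switched over to the new buffer.
- The old buffer is freed.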

score 2 · Accepted Answer

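When a vector runs out of space, it is reallocated and all the elements are copied over to the new array. The old array is then destroyed.

To avoid an excessive number of allocations and keep the average push_back() time at O(1), a reallocation must increase the size by at least a constant factor (1.5 and 2 are common).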

score 0 · Accepted Answer

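When you call vector::push_back, the end pointer is compared with the capacity pointer. If there is enough space for the new object, placement new is called to construct the object in the available space, and the end pointer is incremented.

If there is not enough space, the vector calls its allocator to allocate enough contiguous space for at least the existing elements plus the new one (different implementations may grow the allocated memory by different multipliers). All the existing elements, plus the new one, are then copied into the newly allocated space.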

score 0 · Accepted Answer

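std::vector over-allocates: it usually allocates more memory than it needs, automatically. size is not affected by this, but you can see the allocated amount through capacity.

std::vector will copy everything if the extra capacity is not sufficient.

The memory allocated by std::vector is raw: no constructors run at allocation time; they are called on demand, using placement new.

So, push_back does the following:

If the capacity is not sufficient for the new element, it will
- allocate a new block
- copy all existing elements (usually with the copy constructor)
Increase the size by one
Copy the new element into the new location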

score 0 · Accepted Answer

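If you have some idea of what the final size of your array will be, try to vector::reserve the memory first. Note that reserve is different from vector::resize: with reserve, the vector::size() of your array does not change.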
