It almost certainly depends on the values of old_size, new_size and header_size, and also on the implementation. You'd have to pick some values and measure.
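A rough way to do that measuring is sketched below. The grow_fn signature, the single_realloc baseline and time_grow are all hypothetical names made up for this sketch, not anything from the question. clock() is coarse, so this only tells you anything with large sizes or many iterations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature: grow *bufp to new_size bytes, keeping the first
 * header_size bytes. Returns 0 on success; on failure *bufp stays valid. */
typedef int (*grow_fn)(void **bufp, size_t header_size, size_t new_size);

/* Baseline strategy for illustration: one plain realloc. */
int single_realloc(void **bufp, size_t header_size, size_t new_size)
{
    (void)header_size;              /* realloc preserves everything anyway */
    void *p = realloc(*bufp, new_size);
    if (p == NULL)
        return -1;
    *bufp = p;
    return 0;
}

/* Returns the CPU seconds spent inside `iterations` calls of fn,
 * or -1.0 if an allocation fails. */
double time_grow(grow_fn fn, size_t old_size, size_t header_size,
                 size_t new_size, int iterations)
{
    assert(iterations > 0 && old_size > 0);
    clock_t total = 0;
    for (int i = 0; i < iterations; i++) {
        void *buf = malloc(old_size);
        if (buf == NULL)
            return -1.0;
        memset(buf, 0xAB, old_size);    /* touch the memory before timing */
        clock_t start = clock();
        int rc = fn(&buf, header_size, new_size);
        total += clock() - start;
        free(buf);                      /* buf is valid whether or not fn failed */
        if (rc != 0)
            return -1.0;
    }
    return (double)total / CLOCKS_PER_SEC;
}
```

Call it as time_grow(single_realloc, old_size, header_size, new_size, n) for each strategy and each combination of sizes you care about, and compare the totals on your own allocator.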
(1) is probably best in the case where header_size == old_size-1 && old_size == new_size-1, since it gives you the best chance of the single realloc being basically a no-op. (2) should be only very slightly slower in that case (two almost-no-ops being marginally slower than one).
(3) is probably best in the case where header_size == 1 && old_size == 1024*1024 && new_size == 2048*1024, because the realloc would have to move the allocation, but you avoid copying 1MB of data you don't care about. (2) should be only very slightly slower in that case.
(2) is probably best when header_size is much smaller than old_size, and new_size is in a range where it's reasonably likely that the realloc will relocate, but also reasonably likely that it won't. Then you can't predict which of (1) and (3) it is that will be very slightly faster than (2).
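For concreteness, here is a sketch of what I'm taking the three options to be. This is my reading rather than a quote of the question: (1) a single realloc to new_size, (2) realloc down to header_size and then up to new_size, (3) malloc a new block, copy only the header, free the old block. The function names and the pointer-to-pointer interface are made up for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All three return 0 on success and -1 on failure. On failure *bufp is
 * still a valid allocation holding at least the header. */

/* (1) One realloc. If the block moves, all old_size bytes get copied. */
int grow_single_realloc(void **bufp, size_t header_size, size_t new_size)
{
    (void)header_size;
    void *p = realloc(*bufp, new_size);
    if (p == NULL)
        return -1;
    *bufp = p;
    return 0;
}

/* (2) Shrink to the header first, then grow. If the second realloc moves
 * the block, only header_size bytes need copying. */
int grow_shrink_then_realloc(void **bufp, size_t header_size, size_t new_size)
{
    assert(header_size > 0);        /* realloc(p, 0) is not portable */
    void *p = realloc(*bufp, header_size);
    if (p == NULL)
        return -1;                  /* *bufp untouched */
    *bufp = p;                      /* only the header is valid from here on */
    p = realloc(*bufp, new_size);
    if (p == NULL)
        return -1;                  /* *bufp still holds the header */
    *bufp = p;
    return 0;
}

/* (3) Fresh allocation, copy only the header, free the old block. */
int grow_malloc_copy_free(void **bufp, size_t header_size, size_t new_size)
{
    assert(header_size <= new_size);
    void *p = malloc(new_size);
    if (p == NULL)
        return -1;                  /* *bufp untouched */
    memcpy(p, *bufp, header_size);
    free(*bufp);
    *bufp = p;
    return 0;
}
```

In all three, only the first header_size bytes are meant to be relied on afterwards. (1) happens to preserve more, which is exactly the copying that (3) is trying to avoid.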
In analyzing (2), I have assumed that realloc downwards is approximately free and returns the same pointer. This is not guaranteed. I can think of two things that can mess you up:
- realloc downwards copies to a new allocation
- realloc downwards splits the buffer to create a new chunk of free memory, but then when you realloc back up again the allocator doesn't merge that new free chunk straight back onto your buffer again in order to return without copying.
Either of those could make (2) significantly more expensive than (1). So it's an implementation detail whether or not (2) is a good way of hedging your bets between the advantages of (1) (sometimes avoids copying anything) and the advantages of (3) (sometimes avoids copying too much).
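If you want to see which way your implementation behaves, a small probe along these lines would do. The struct and function names are invented for the sketch. Addresses are saved as integers before each realloc, because the old pointer value shouldn't be used once realloc has succeeded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct probe_result {
    int ok;               /* 1 if every allocation succeeded */
    int shrink_in_place;  /* 1 if realloc downwards returned the same address */
    int regrow_in_place;  /* 1 if realloc back upwards returned the same address */
};

/* Allocate old_size, shrink to header_size, grow to new_size, and report
 * whether the block moved at each step. */
struct probe_result probe_shrink_regrow(size_t old_size, size_t header_size,
                                        size_t new_size)
{
    struct probe_result r = {0, 0, 0};
    assert(old_size > 0 && header_size > 0 && new_size > 0);

    void *p = malloc(old_size);
    if (p == NULL)
        return r;
    uintptr_t addr0 = (uintptr_t)p;

    void *q = realloc(p, header_size);
    if (q == NULL) {
        free(p);                    /* p is still valid when realloc fails */
        return r;
    }
    uintptr_t addr1 = (uintptr_t)q;

    void *s = realloc(q, new_size);
    if (s == NULL) {
        free(q);
        return r;
    }
    uintptr_t addr2 = (uintptr_t)s;
    free(s);

    r.ok = 1;
    r.shrink_in_place = (addr1 == addr0);
    r.regrow_in_place = (addr2 == addr1);
    return r;
}
```

A result with shrink_in_place == 0 is the first problem in the list above. shrink_in_place == 1 with regrow_in_place == 0 at a new_size no bigger than old_size suggests the second. Results from a fresh process with an otherwise empty heap won't necessarily match what a long-running program sees.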
Btw, this kind of idle speculation about performance is more useful for tentatively explaining observations you have already made than for tentatively predicting what we would observe in the unlikely event that we actually cared enough about performance to test it.
Furthermore, I suspect that for large allocations, the implementation might be able to do even a relocating realloc without copying anything, by re-mapping the memory to a new address. In which case they would all be fast. I haven't looked into whether implementations actually do that, though.