What happens is this:
On the caller's side, a return slot is provided which can hold the result. That is, the caller provides the memory for a variable of type std::vector<int>. It expects the called function to construct the value there, and is itself responsible for calling the destructor once the result is no longer used and for freeing the memory (if necessary; it probably just lives on the stack).
The called function (which may live in a different translation unit!) would, without the NRVO, do this:
- Provide a memory slot for ret.
- Construct a local variable ret in this memory slot.
- Do stuff...
- Copy-construct the return value in the provided memory slot by copying ret.
- Call ret's destructor.
Now, with the NRVO, the decision to optimize this can be made entirely within the called function's translation unit. It transforms the above into:
- Construct ret in the memory of the function's return slot.
- Do stuff...
Nothing else needs to be done: the caller owns the memory and calls the destructor, so the optimization is completely transparent to the caller :)
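To make the two sequences concrete, here is a hand-written sketch of roughly what the compiler arranges behind the scenes. Everything in it is an assumption for illustration: ReturnSlot, f_without_nrvo, f_with_nrvo and call_f are hypothetical names, the body of f (copy the argument, append 42) is made up, and a real compiler does not literally emit placement new like this.

```cpp
#include <new>
#include <vector>

using Vec = std::vector<int>;

// Hypothetical stand-in for the raw, suitably aligned memory that the
// caller reserves for the result (the "return slot").
struct ReturnSlot {
    alignas(Vec) unsigned char bytes[sizeof(Vec)];
};

// Without NRVO: ret lives in the callee's own memory and is copied out.
Vec* f_without_nrvo(void* slot, const Vec& v) {
    Vec ret(v);                    // construct the local variable ret
    ret.push_back(42);             // do stuff...
    return ::new (slot) Vec(ret);  // copy-construct the result in the slot
}                                  // ret's destructor runs here

// With NRVO: ret *is* the object in the caller's return slot.
Vec* f_with_nrvo(void* slot, const Vec& v) {
    Vec* ret = ::new (slot) Vec(v);  // construct ret directly in the slot
    ret->push_back(42);              // do stuff...
    return ret;                      // nothing to copy, nothing to destroy
}

// The caller's side is identical in both cases: provide the slot, use the
// result, then run the destructor. That is why the callee can decide alone.
Vec call_f(bool use_nrvo, const Vec& v) {
    ReturnSlot slot;
    Vec* result = use_nrvo ? f_with_nrvo(slot.bytes, v)
                           : f_without_nrvo(slot.bytes, v);
    Vec observed(*result);  // use the result
    result->~Vec();         // the caller destroys the returned object
    return observed;
}
```

Both variants hand the caller the same value; the only difference is whether an extra copy and destructor call happen inside the callee.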
This, of course, can't eliminate the assignment into v in your example. If you store the result in a different variable, e.g.
std::vector<int> w = f(v);
the NRVO will construct ret directly into w's memory (as this will be passed in as the return slot to f).
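The difference between assigning to an existing variable and initializing a new one can be observed with an instrumented type. The following is a minimal sketch, assuming a hypothetical wrapper Tracked and a made-up body for f; neither is taken from the question. Since the NRVO is optional, the number of move constructions is compiler-dependent, so the comments only state the counts that hold either way.

```cpp
#include <utility>
#include <vector>

// Hypothetical instrumented type: wraps a vector and counts which special
// member functions actually run.
struct Tracked {
    std::vector<int> data;
    static int copy_ctors, move_ctors, move_assigns;

    explicit Tracked(std::vector<int> d) : data(std::move(d)) {}
    Tracked(const Tracked& o) : data(o.data) { ++copy_ctors; }
    Tracked(Tracked&& o) noexcept : data(std::move(o.data)) { ++move_ctors; }
    Tracked& operator=(Tracked&& o) noexcept {
        data = std::move(o.data);
        ++move_assigns;
        return *this;
    }
    static void reset() { copy_ctors = move_ctors = move_assigns = 0; }
};
int Tracked::copy_ctors = 0;
int Tracked::move_ctors = 0;
int Tracked::move_assigns = 0;

// Assumed shape of f: copy the argument into a named local and return it.
Tracked f(const Tracked& v) {
    Tracked ret(v);          // exactly one copy construction
    ret.data.push_back(0);   // do stuff...
    return ret;              // NRVO candidate; a move if not elided
}

struct Counts { int copy_ctors, move_ctors, move_assigns; };

// v = f(v): the result is built in a temporary return slot and must then
// be assigned into the already existing v.
Counts assign_case() {
    Tracked v(std::vector<int>{1, 2});
    Tracked::reset();
    v = f(v);
    return {Tracked::copy_ctors, Tracked::move_ctors, Tracked::move_assigns};
}

// Tracked w = f(v): w's own storage can serve as the return slot, so no
// assignment takes place at all.
Counts init_case() {
    Tracked v(std::vector<int>{1, 2});
    Tracked::reset();
    Tracked w = f(v);
    return {Tracked::copy_ctors, Tracked::move_ctors, Tracked::move_assigns};
}
```

In assign_case the move assignment always happens, with or without the NRVO; in init_case there is no assignment, and with the NRVO applied the only remaining operation is the copy that creates ret.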