Different alignment rules or type widths are the two main ways you could get a difference, but -march= doesn't change that, not when compiling for the same ABI on the same ISA. (Otherwise -march=skylake-avx512 code couldn't call -march=sandybridge code and vice versa, if they disagreed on struct layouts.)
Compiling for a different ABI can save space, especially in pointer-heavy data structures. Specifically, an ILP32 ABI such as Linux x32 has 4-byte pointers instead of 8, so struct foo { foo *next; int val; }; is 8 bytes instead of 16. (The 16 includes 4 bytes of tail padding to make sizeof(foo) a multiple of alignof(foo), which the struct inherits from pointers needing 8-byte alignment.) But that won't work for your use-case of 100GB of data; 32-bit pointers limit you to 4GiB of address space.
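As a rough sketch of that layout arithmetic, the snippet below uses the same struct and lets the compiler report the sizes. The helper name foo_tail_padding is made up for illustration, and the numbers in the comments assume the usual x86-64 System V (LP64) and x32 / i386 (ILP32) type widths.

```cpp
#include <cstddef>

struct foo {
    foo *next;  // 8 bytes under LP64, 4 bytes under an ILP32 ABI
    int  val;   // 4 bytes under both
};

// sizeof is always rounded up to a multiple of alignof, in any ABI.
static_assert(sizeof(foo) % alignof(foo) == 0,
              "sizeof must be a multiple of alignof");

// Hypothetical helper: bytes of padding the compiler added to the struct.
// Expected to be 4 under LP64 (16 - 8 - 4) and 0 under ILP32 (8 - 4 - 4).
constexpr std::size_t foo_tail_padding() {
    return sizeof(foo) - sizeof(foo *) - sizeof(int);
}
```

Building the same file with -m32 or -mx32 (if your toolchain has the libraries for it) and comparing sizeof(foo) against a plain x86-64 build should show the difference. Changing only -march= should not.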
-march= could have some small effect on stack space when auto-vectorizing, e.g. a function might align the stack by 64 in order to spill/reload a ZMM vector.
Older GCC might align the stack that way even if the final asm doesn't actually store or load any vectors to the stack frame. But that's at most an extra 56 bytes of wasted stack space per level of function nesting, vs. 16-byte alignment which can be had for free as part of the calling convention.
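Whether the auto-vectorizer actually spills a ZMM register depends on the compiler version and the code, so as a stand-in this sketch forces the same kind of stack realignment with an explicitly over-aligned local. The function name is hypothetical, and it assumes a compiler that supports alignas(64) on automatic variables, as GCC and clang do on x86-64.

```cpp
#include <cstdint>

// Hypothetical helper: returns the address of a 64-byte-aligned local,
// modulo 64. To honor alignas(64) the compiler has to realign the stack
// pointer in this function, which can waste some bytes of stack below
// the caller's frame.
std::uintptr_t over_aligned_local_misalignment() {
    alignas(64) volatile char buf[64];  // volatile so the buffer is really kept
    buf[0] = 1;
    return reinterpret_cast<std::uintptr_t>(&buf[0]) % 64;
}
```

Looking at the asm for this function, you would expect to see the stack pointer being masked to a multiple of 64 in the prologue. That is the same cost a vectorized function pays when it needs aligned spill slots.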
GCC / clang's optimizers won't AFAIK do any optimizations that change the size of dynamic allocations. Clang can sometimes optimize away a dynamic allocation entirely, for example in a function that creates and destroys a std::vector<float> foo(100); where all accesses to it can be optimized away. (e.g. if you store constants into the vector and then read them back, it can just optimize that away and then eliminate the allocation, too. Or a std::vector that isn't even used.)
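A minimal sketch of the kind of function that is a candidate for this is below. The function name is made up, and whether a given compiler version really removes the new/delete pair is not guaranteed. You would have to check the generated asm at -O2 or higher.

```cpp
#include <vector>

// Hypothetical example: the vector never escapes this function, and the
// only values read back are constants stored a moment earlier. An optimizer
// that is allowed to elide the allocation could reduce this to returning
// a constant.
float sum_via_temporary_vector() {
    std::vector<float> foo(100);  // heap allocation of 100 floats
    foo[0] = 1.0f;
    foo[1] = 2.0f;
    return foo[0] + foo[1];       // 1.0f + 2.0f
}
```

Either way, this only removes an allocation completely. It doesn't shrink one that the program actually needs.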
Possibly a different allocator library that's better at reducing internal fragmentation could save space, if you end up with some memory pages allocated but not fully used. But that's not something -march= affects.