If clang++ -stdlib=libstdc++ doesn't solve your problem, link with -latomic to get the implementation of these functions.
Try to get your compiler to inline 8-byte and narrower atomics, though, because the library functions have potentially large downsides.
Beware that the library functions don't support a memory ordering weaker than memory_order_seq_cst, so they always use mfence on x86, even if the source used relaxed.
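To check what your own toolchain does, a minimal sketch like the following can help. The names shared_value, publish_relaxed and read_relaxed are hypothetical, chosen only for this illustration. Compile it with optimization to assembly (e.g. -O2 -S) and look at the output: an inlined version should show a plain store/load, while the library path shows up as a call to a function such as __atomic_store_8.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 8-byte shared variable, used only for this illustration.
std::atomic<std::int64_t> shared_value{0};

// Relaxed store: if the compiler inlines 8-byte atomics, this needs no
// library call and no full barrier on x86.
void publish_relaxed(std::int64_t v) {
    shared_value.store(v, std::memory_order_relaxed);
}

// Relaxed load of the same variable.
std::int64_t read_relaxed() {
    return shared_value.load(std::memory_order_relaxed);
}
```

Note that is_lock_free() returning true would not by itself tell you the operation was inlined, so inspecting the generated asm is the more direct check.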
The 32-bit x86 version of __atomic_store_8 is even worse: it uses lock cmpxchg8b instead of an SSE or x87 8-byte store. This makes it work even if the object is misaligned, but at a massive performance penalty. It also has two redundant lock or [esp], 0 instructions as extra barriers around loading its arguments from the stack. (I'm looking at /usr/lib32/libatomic.so.1.2.0 from gcc7.1.1 on Arch Linux.)
Ironically, current gcc -m32 (in C11 mode, not C++11) under-aligns atomic_llong inside a struct, but inlines movq xmm loads/stores, so it's not actually atomic when the object spans a cache-line boundary. (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65146#c4)
Current clang -m32 aligns atomic_llong to 8 bytes even inside structs (unlike regular long long, which the i386 System V ABI only aligns to 4B). The irony is that clang generates calls to the library functions, which use lock cmpxchg8b (https://bugs.llvm.org/show_bug.cgi?id=33109), so it actually is atomic even with cache-line splits. (See: Why is integer assignment on a naturally aligned variable atomic on x86?)
So clang is safe even if some gcc-compiled code passes it a pointer to a misaligned _Atomic long long. But it disagrees with gcc about struct layout, so this can only help if it gets a pointer to the atomic variable directly, rather than the containing struct.
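If you want to see what layout your compiler picks, here is a rough sketch. Sample, counter_offset and is_naturally_aligned are hypothetical names made up for this example. It uses C++ std::atomic, which may be aligned differently from the C11 _Atomic case discussed above, so treat it as a way to inspect your own toolchain rather than a reproduction of the gcc bug.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical struct: a 4-byte member followed by an 8-byte atomic.
// With only 4-byte alignment for the atomic, counter would land at offset 4.
struct Sample {
    int tag;
    std::atomic<long long> counter;
};

// Offset the compiler chose for the atomic member.
constexpr std::size_t counter_offset() {
    return offsetof(Sample, counter);
}

// True if the address is a multiple of size (natural alignment).
bool is_naturally_aligned(const void* p, std::size_t size) {
    return reinterpret_cast<std::uintptr_t>(p) % size == 0;
}
```

Printing counter_offset() and alignof(std::atomic<long long>) under both gcc -m32 and clang -m32 shows whether the two agree on the layout of a struct like this one.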
Related: Atomic double floating point or SSE/AVX vector load/store on x86_64