Clang has a C/C++ extension that allows you to treat vector values as first-class citizens:
typedef double double4 __attribute__((ext_vector_type(4)));
// easy assignment
double4 a = {1, 2, 3, 4};
double4 b = {4, 3, 2, 1};
// basic operators work component-wise
double4 c = a + b; // {5, 5, 5, 5}
// you can even swizzle elements!
double4 d = a.zyxw; // {3, 2, 1, 4}
I assume these vectors map onto the underlying platform's SIMD instructions (SSE on Intel Macs, NEON on ARM). However, I'm not sure how the Mac OS calling convention handles vector types.
Is it more efficient to pass vectors by reference or by value? The difference might not be huge, but since I'll be passing around a lot of vectors, I'd like to pick up the right habit as early as possible.
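To make the question concrete, here is a minimal sketch of the two styles I'm comparing. The function names `add_by_value` and `add_by_reference` are made up for illustration, and the non-Clang fallback typedef is only an assumption so the snippet compiles with other compilers; it says nothing about which style ends up faster.

```c
#if defined(__clang__)
/* Clang's ext_vector_type, same typedef as above. */
typedef double double4 __attribute__((ext_vector_type(4)));
#else
/* Assumed fallback for non-Clang compilers (GCC-style vector_size);
   component-wise operators work here too, swizzles do not. */
typedef double double4 __attribute__((vector_size(4 * sizeof(double))));
#endif

/* Option 1 (hypothetical name): operands and result passed by value. */
double4 add_by_value(double4 a, double4 b) {
    return a + b; /* component-wise add */
}

/* Option 2 (hypothetical name): operands passed by pointer,
   result written through an out-parameter. */
void add_by_reference(const double4 *a, const double4 *b, double4 *out) {
    *out = *a + *b; /* loads both operands, stores the component-wise sum */
}
```

Both functions compute the same result; the only difference is whether the vectors travel through the calling convention as values or as addresses, which is exactly the part I'm unsure about.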