c++ - Strict aliasing and alignment

Question

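I need a safe way to alias between arbitrary POD types that conforms to ISO C++11, explicitly taking into account 3.10/10 and 3.11 of n3242 or later. There are many questions about strict aliasing here, but most of them concern C rather than C++. I found a "solution" for C that uses unions, probably relying on this clause:

a union type that includes one of the aforementioned types among its elements or non-static data members

From that I built this.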

#include <iostream>

template <typename T, typename U>
T& access_as(U* p)
{
    union dummy_union
    {
        U dummy;
        T destination;
    };

    dummy_union* u = (dummy_union*)p;

    return u->destination;
}

struct test
{
    short s;
    int i;
};

int main()
{
    int buf[2];

    static_assert(sizeof(buf) >= sizeof(double), "");
    static_assert(sizeof(buf) >= sizeof(test), "");

    access_as<double>(buf) = 42.1337;
    std::cout << access_as<double>(buf) << '\n';

    access_as<test>(buf).s = 42;
    access_as<test>(buf).i = 1234;

    std::cout << access_as<test>(buf).s << '\n';
    std::cout << access_as<test>(buf).i << '\n';
}

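My question is, just to be sure: is this program legal according to the standard?^*

It gives no warnings at all and works as expected when compiled with MinGW/GCC 4.6.2 using: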

g++ -std=c++0x -Wall -Wextra -O3 -fstrict-aliasing -o alias.exe alias.cpp

^*Edit: And if not, how could it be modified to be legal?

score 15 · Accepted Answer

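This will never be legal, no matter what contortions you perform with weird casts, unions and so on.

The fundamental fact is this: two objects of different types may never alias in memory, with a few special exceptions (see below).

Example

Consider the following code: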

void sum(double& out, float* in, int count) {
    for(int i = 0; i < count; ++i) {
        out += *in++;
    }
}

Let's break it down into local register variables to model the actual execution more closely:

void sum(double& out, float* in, int count) {
    for(int i = 0; i < count; ++i) {
        register double out_val = out; // (1)
        register double in_val = *in; // (2)
        register double tmp = out_val + in_val;
        out = tmp; // (3)
        in++;
    }
}

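Suppose that (1), (2) and (3) represent a memory read, a memory read and a memory write respectively, which can be very expensive operations in such a tight inner loop. A reasonable optimization of this loop would be the following: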

void sum(double& out, float* in, int count) {
    register double tmp = out; // (1)
    for(int i = 0; i < count; ++i) {
        register double in_val = *in; // (2)
        tmp = tmp + in_val;
        in++;
    }
    out = tmp; // (3)
}

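This optimization halves the number of memory reads and reduces the number of memory writes to 1. That can have a huge impact on the performance of the code, and it is a very important optimization for every optimizing C and C++ compiler.

Now suppose that we don't have strict aliasing. Suppose that a write to an object of any type can affect any other object. Suppose that a write to a double can affect the value of a float somewhere. That makes the optimization above suspect, because the programmer may in fact have intended out and in to alias, so that each store to out changes the values read in later iterations. Sounds stupid? Even so, the compiler cannot distinguish between "stupid" and "smart" code. It can only distinguish between well-formed and ill-formed code. If we allow free aliasing, the compiler must be conservative in its optimizations and must perform the extra store (3) in every iteration of the loop.

Hopefully you can now see why no such union or cast trick can possibly be legal. You cannot circumvent a concept this fundamental by sleight of hand.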

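Exceptions to strict aliasing

The C and C++ standards make special provision for aliasing any type with char, and with any "related type", which includes derived and base types as well as members, because being able to use the address of a class member independently is so important. You can find an exhaustive list of these provisions in this answer.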
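To make the char exception concrete, here is a minimal sketch that inspects the object representation of an arbitrary object through unsigned char. The helper names byte_at and count_nonzero_bytes are made up for this illustration and are not part of any library; the sketch assumes the type has no padding bits that would disturb the count.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: return byte i of obj's object representation.
// Reading any object through (unsigned) char is one of the permitted
// aliasing exceptions, so this does not violate strict aliasing.
template <typename T>
unsigned char byte_at(const T& obj, std::size_t i)
{
    assert(i < sizeof(T)); // stay inside the object
    return reinterpret_cast<const unsigned char*>(&obj)[i];
}

// Hypothetical helper: count the bytes of obj that are not zero.
template <typename T>
std::size_t count_nonzero_bytes(const T& obj)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        if (byte_at(obj, i) != 0)
            ++n;
    return n;
}
```

Note that the exception is one-directional: an int may be examined as a sequence of char, but a char buffer may not be read as an int through a cast pointer.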

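Furthermore, GCC makes special provision for reading from a union member other than the one last written to. Note that this kind of conversion through a union does not actually allow you to violate aliasing. Only one member of a union may be active at any one time, so, for example, the following is undefined behavior even with GCC: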

union {
    double d;
    float f[2];
};
f[0] = 3.0f;
f[1] = 5.0f;
sum(d, f, 2); // UB: attempt to treat two members of
              // a union as simultaneously active

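Workarounds

The only standard way to reinterpret the bits of one object as the bits of an object of some other type is to use an equivalent of memcpy. This relies on the special provision for aliasing with char objects, which in effect lets you read and modify the underlying object representation at the byte level. For example, the following is legal and does not violate the strict aliasing rules: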

int a[2];
double d;
static_assert(sizeof(a) == sizeof(d), "int[2] and double must have the same size");
memcpy(a, &d, sizeof(d));

This is semantically equivalent to the following code:

int a[2];
double d;
static_assert(sizeof(a) == sizeof(d), "int[2] and double must have the same size");
for(size_t i = 0; i < sizeof(a); ++i)
   ((char*)a)[i] = ((char*)&d)[i];
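The memcpy idiom above can be wrapped in a small helper. The following is only a sketch: the name copy_bits is invented here, and the expected bit pattern in the self-check assumes a 32-bit float using the IEEE-754 binary32 format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy the object representation of `from` into a
// fresh object of type To. Only bytes are copied, so no lvalue of the
// wrong type is ever accessed.
template <typename To, typename From>
To copy_bits(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Small self-check; the first assertion assumes IEEE-754 binary32 floats.
inline void copy_bits_self_check()
{
    assert(copy_bits<std::uint32_t>(1.0f) == 0x3f800000u);
    assert(copy_bits<float>(copy_bits<std::uint32_t>(2.5f)) == 2.5f);
}
```

Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, so the legal version usually costs nothing compared with the cast.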

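GCC makes a provision for reading from an inactive union member, implicitly making it active. From the GCC documentation:

The practice of reading from a different union member than the one most recently written to (called "type-punning") is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above will work as expected. See Structures unions enumerations and bit-fields implementation. However, this code might not: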

int f() {
    union a_union t;
    int* ip;
    t.d = 3.0;
    ip = &t.i;
    return *ip;
}

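Similarly, access by taking the address, casting the resulting pointer and dereferencing the result has undefined behavior, even if the cast uses a union type, e.g.: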

int f() {
    double d = 3.0;
    return ((union a_union *) &d)->i;
}

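Placement new

(Note: I'm going from memory here, as I don't have access to the standard right now.) Once you placement-new an object into a storage buffer, the lifetime of the underlying storage objects ends implicitly. This is similar to what happens when you write to a member of a union: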

union {
    int i;
    float f;
} u;

// No member of u is active. Neither i nor f refer to an lvalue of any type.
u.i = 5;
// The member u.i is now active, and there exists an lvalue (object)
// of type int with the value 5. No float object exists.
u.f = 5.0f;
// The member u.i is no longer active,
// as its lifetime has ended with the assignment.
// The member u.f is now active, and there exists an lvalue (object)
// of type float with the value 5.0f. No int object exists.

Now let's look at something similar with placement new:

#define MAX_(x, y) ((x) > (y) ? (x) : (y))
// new returns suitably aligned memory
char* buffer = new char[MAX_(sizeof(int), sizeof(float))];
// Currently, only char objects exist in the buffer.
new (buffer) int(5);
// An object of type int has been constructed in the memory pointed to by buffer,
// implicitly ending the lifetime of the underlying storage objects.
new (buffer) float(5.0f);
// An object of type float has been constructed in the memory pointed to by buffer,
// implicitly ending the lifetime of the int object that previously occupied the same memory.

For obvious reasons, this kind of implicit end of lifetime can only occur for types with trivial constructors and destructors.
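As a sketch of the same idea with automatic storage instead of new char[], the following reuses one suitably aligned byte array for an int and then a float. The names int_or_float_storage and reuse_storage are made up for this example, and alignas requires a compiler with C++11 alignment support.

```cpp
#include <cassert>
#include <new>

// Hypothetical storage type: big enough and aligned strictly enough
// for either an int or a float.
struct int_or_float_storage
{
    alignas(int) alignas(float) unsigned char
        bytes[sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float)];
};

inline float reuse_storage(int_or_float_storage& s)
{
    int* ip = new (s.bytes) int(5);        // an int object now lives in s.bytes
    assert(*ip == 5);

    float* fp = new (s.bytes) float(5.0f); // ends the int's lifetime, starts a float's
    // From here on ip must not be dereferenced: no int object exists any more.
    return *fp;                            // access only through the pointer new returned
}
```

The important detail is that every access goes through the pointer returned by the most recent placement new, never through a stale pointer to the previous occupant.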

score 6 · Accepted Answer

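Aside from the error when sizeof(T) > sizeof(U), a possible problem is that, because of T, the union has its own alignment requirement, which may be stricter than that of U. If you don't instantiate this union, so that its memory block is properly aligned (and large enough!), and then fetch the member with the destination type T, it will break silently in the worst case.

For example, an alignment error occurs if you do a C-style cast from U*, where U requires 4-byte alignment, to dummy_union*, where dummy_union requires 8-byte alignment because alignof(T) == 8. After that, you may read the union member of type T from an address aligned to 4 bytes instead of 8.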
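The alignment hazard described above can at least be detected at run time. This is a minimal sketch with an invented helper name, is_aligned_for; it assumes std::uintptr_t exists and that pointer-to-integer conversion preserves the low address bits, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Demonstration under the assumptions alignof(std::uint64_t) <= 8
// and alignof(std::uint16_t) == 2.
inline void alignment_demo()
{
    alignas(8) unsigned char buf[16] = {};
    assert(is_aligned_for<std::uint64_t>(buf));      // buf starts on an 8-byte boundary
    assert(!is_aligned_for<std::uint16_t>(buf + 1)); // an odd address is misaligned
}
```

Passing this check says nothing about aliasing: a correctly aligned pointer to the wrong type is still a strict aliasing violation when dereferenced.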

Alias cast (alignment- and size-safe reinterpret_cast for PODs only):

This proposal does explicitly violate strict aliasing, but it is guarded by static assertions:

///@brief Compile time checked reinterpret_cast where destAlign <= srcAlign && destSize <= srcSize
template<typename _TargetPtrType, typename _ArgType>
inline _TargetPtrType alias_cast(_ArgType* const ptr)
{
    //assert argument alignment at runtime in debug builds
    assert(uintptr_t(ptr) % alignof(_ArgType) == 0);

    typedef typename std::tr1::remove_pointer<_TargetPtrType>::type target_type;
    static_assert(std::tr1::is_pointer<_TargetPtrType>::value && std::tr1::is_pod<target_type>::value, "Target type must be a pointer to POD");
    static_assert(std::tr1::is_pod<_ArgType>::value, "Argument must point to POD");
    static_assert(std::tr1::is_const<_ArgType>::value ? std::tr1::is_const<target_type>::value : true, "const argument must be cast to const target type");
    static_assert(alignof(_ArgType) % alignof(target_type) == 0, "Target alignment must be <= source alignment");
    static_assert(sizeof(_ArgType) >= sizeof(target_type), "Target size must be <= source size");

    //reinterpret cast doesn't remove a const qualifier either
    return reinterpret_cast<_TargetPtrType>(ptr);
}

Usage with a pointer type argument (like standard cast operators such as reinterpret_cast):

int* x = alias_cast<int*>(any_ptr);

Another approach (circumvents alignment and aliasing issues using a temporary union):

template<typename ReturnType, typename ArgType>
inline ReturnType alias_value(const ArgType& x)
{
    //test argument alignment at runtime in debug builds
    assert(uintptr_t(&x) % alignof(ArgType) == 0);

    static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
    static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
    static_assert(std::tr1::is_pod<ArgType>::value, "Argument must be of POD type");

    //assure, that we don't read garbage
    static_assert(sizeof(ReturnType) <= sizeof(ArgType),"Target size must be <= argument size");

    union dummy_union
    {
        ArgType x;
        ReturnType r;
    };

    dummy_union dummy;
    dummy.x = x;

    return dummy.r;
}

Usage:

struct characters
{
    char c[5];
};

//.....

characters chars;

chars.c[0] = 'a';
chars.c[1] = 'b';
chars.c[2] = 'c';
chars.c[3] = 'd';
chars.c[4] = '\0';

int r = alias_value<int>(chars);

The disadvantage of this is that the union may require more memory than ReturnType actually needs.

Wrapped memcpy (circumvents alignment and aliasing issues using memcpy):

template<typename ReturnType, typename ArgType>
inline ReturnType alias_value(const ArgType& x)
{
    //assert argument alignment at runtime in debug builds
    assert(uintptr_t(&x) % alignof(ArgType) == 0);

    static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
    static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
    static_assert(std::tr1::is_pod<ArgType>::value, "Argument must be of POD type");

    //assure, that we don't read garbage
    static_assert(sizeof(ReturnType) <= sizeof(ArgType),"Target size must be <= argument size");

    ReturnType r;
    memcpy(&r,&x,sizeof(ReturnType));

    return r;
}

For dynamically sized arrays of any POD type:

template<typename ReturnType, typename ElementType>
ReturnType alias_value(const ElementType* const array,const size_t size)
{
    //assert argument alignment at runtime in debug builds
    assert(uintptr_t(array) % alignof(ElementType) == 0);

    static const size_t min_element_count = (sizeof(ReturnType) / sizeof(ElementType)) + (sizeof(ReturnType) % sizeof(ElementType) != 0 ? 1 : 0);

    static_assert(!std::tr1::is_pointer<ReturnType>::value ? !std::tr1::is_const<ReturnType>::value : true, "Target type can't be a const value type");
    static_assert(std::tr1::is_pod<ReturnType>::value, "Target type must be POD");
    static_assert(std::tr1::is_pod<ElementType>::value, "Array elements must be of POD type");

    //check for minimum element count in array
    if(size < min_element_count)
        throw std::invalid_argument("insufficient array size");

    ReturnType r;
    memcpy(&r,array,sizeof(ReturnType));
    return r;
}

More efficient approaches may use intrinsics for explicit unaligned reads, such as those from SSE, to extract primitives.

Examples:

struct sample_struct
{
    char c[4];
    int _aligner;
};

int test(void)
{
    const sample_struct constPOD    = {};
    sample_struct pod               = {};
    const char* str                 = "abcd";

    const int* constIntPtr  = alias_cast<const int*>(&constPOD);
    void* voidPtr           = alias_value<void*>(pod);
    int intValue            = alias_value<int>(str,strlen(str));

    return 0;
}

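EDIT:

Assertions to ensure conversion of PODs only; may be improved.
Removed superfluous template helpers, now using tr1 traits only
Static assertions for clarification and to prohibit const value (non-pointer) return types
Runtime assertions for debug builds
Added const qualifiers to some function arguments
Another type punning function using memcpy
Refactoring
Small example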

score 4 · Accepted Answer

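I think that at the most fundamental level this is impossible and violates strict aliasing. The only thing you have achieved is tricking the compiler into not noticing.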

score 2 · Accepted Answer

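My question is, just to be sure: is this program legal according to the standard?

No. With the alias you have provided, the alignment may be unnatural. The union you wrote just moves the point at which the aliasing happens. It may appear to work, but the program may fail when CPU options, the ABI, or compiler settings change.

And if not, how could it be modified to be legal?

Create natural temporary variables and treat your storage as a memory blob (moving data into and out of the blob to and from temporaries), or use a union which represents all your types (remember, only one element is active at a time here).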
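One way to read the "temporaries plus memory blob" suggestion is sketched below as a rework of the question's access_as. The names store_as, load_as and short_int_pair are invented for this illustration; the blob is an unsigned char array, and every typed access goes through a naturally aligned temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy `value` into the start of the byte blob.
template <typename T, std::size_t N>
void store_as(unsigned char (&blob)[N], const T& value)
{
    static_assert(sizeof(T) <= N, "blob too small for T");
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    std::memcpy(blob, &value, sizeof(T));
}

// Hypothetical helper: copy the start of the blob into a temporary T.
template <typename T, std::size_t N>
T load_as(const unsigned char (&blob)[N])
{
    static_assert(sizeof(T) <= N, "blob too small for T");
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    T tmp; // naturally aligned temporary
    std::memcpy(&tmp, blob, sizeof(T));
    return tmp;
}

// Stand-in for the question's struct test.
struct short_int_pair
{
    short s;
    int i;
};

inline void blob_demo()
{
    unsigned char buf[16] = {};
    store_as(buf, 42.1337);
    assert(load_as<double>(buf) == 42.1337); // exact: the same bytes come back

    short_int_pair p = {42, 1234};
    store_as(buf, p);
    assert(load_as<short_int_pair>(buf).s == 42);
    assert(load_as<short_int_pair>(buf).i == 1234);
}
```

Unlike access_as, these helpers return values rather than references, so modifying a single member means loading the whole object, changing the temporary and storing it back.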