
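I have two questions: a general one about operating on pointers of different types, and one about my specific case.

What happens when you access a memory buffer through a pointer of a different type?

In practice, across many different compilers, it seems to work the way my brain wants it to. But I more or less know that it is UB in many cases (if not all of them). For example: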

typedef unsigned char byte;
struct color { /* stuff */};

std::vector<color> colors( 512 * 512 );
// pointer of one type
color* colordata = colors.data();
// pointer to another type?
byte* bytes = reinterpret_cast<byte*>( colordata );

// Proceed to read from (potentially write into) 
// the "bytes" of the 512 * 512 heap array

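The first question: is this kind of cast legal/safe/sanctioned by the standard?

The second question: setting the first one aside, what if you know that the struct named color is defined as: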

 struct color { byte c[4]; };

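Now, is it legal/safe/sanctioned by the standard? Is it safe to read? Safe to read and write? I would like to know, because my gut tells me that for these very simple structs the naughty pointer manipulation above is not that bad. Or is it?

[Reason for reopening:] While the linked question about strict aliasing is somewhat applicable here, it is mainly about C. The only answer there that cites the C++03 standard may be outdated compared with the C++11 standard (unless absolutely nothing has changed). This question has a practical application, and I and others would benefit from more answers. Finally, this question asks very specifically whether the access is read-safe, write-safe, both, or neither, and it asks this for two different cases: POD data whose underlying type matches, and the more general case of arbitrary internal data.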


2 Answers


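Both are legal.

First, because byte is a typedef for unsigned char, it gets a magic get-out-of-jail-free card when it comes to strict aliasing. You can access an object of any type through char or unsigned char (the C++ aliasing rule names exactly these two; C additionally allows signed char).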
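As a minimal sketch of the first point, the following reads and writes the vector's storage through a byte pointer. It assumes the four-byte color layout from the second part of the question; the helper names fill_bytes and byte_at are hypothetical and not part of the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char byte;

// Layout from the second part of the question.
struct color { byte c[4]; };

// Assumption of this sketch: color has no padding beyond its four bytes.
static_assert(sizeof(color) == 4, "color is expected to be exactly 4 bytes");

// Hypothetical helper: overwrite every byte of the buffer through a byte*.
inline void fill_bytes(std::vector<color>& colors, byte value) {
    byte* bytes = reinterpret_cast<byte*>(colors.data());
    const std::size_t count = colors.size() * sizeof(color);
    for (std::size_t i = 0; i < count; ++i) {
        bytes[i] = value;
    }
}

// Hypothetical helper: read one byte of the buffer through a const byte*.
inline byte byte_at(const std::vector<color>& colors, std::size_t index) {
    assert(index < colors.size() * sizeof(color));
    const byte* bytes = reinterpret_cast<const byte*>(colors.data());
    return bytes[index];
}
```

With this layout, byte index 5 corresponds to colors[1].c[1], so a write through either view should be visible through the other.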

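Second, in both C and C++ it is perfectly legal to cast a pointer to a struct to a pointer to the type of its first member, as long as the struct meets certain guarantees, such as being a POD (in C++11 terms, being standard-layout is the relevant requirement). This means that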

struct x {
    int f;
};
int main() {
    x var;
    int* p = (int*)&var;
}

does not violate strict aliasing either, even without the get-out clause for char.
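The guarantee this relies on can be checked at compile time. A minimal sketch, assuming C++11 and using a hypothetical helper named first_member:

```cpp
#include <cassert>
#include <type_traits>

struct x {
    int f;
};

// The first-member cast relies on x being a standard-layout type.
static_assert(std::is_standard_layout<x>::value,
              "x must be standard-layout for the first-member cast");

// Hypothetical helper: view a pointer to x as a pointer to its first member.
inline int* first_member(x* p) {
    assert(p != nullptr);
    return reinterpret_cast<int*>(p);
}
```

If a later change to x (for example, adding a virtual function) breaks the requirement, the static_assert fails instead of the cast silently becoming invalid.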

Answered 2013-07-13T02:47:25.290

As has been stated in the comments: accessing the same piece of memory as two different types is UB. So that's the formal answer (note that "UB" includes "doing precisely what you would expect if you are a sane person reading the code" as well as "just about anything other than what a sane person reading the code would expect").

Having said that, it appears that all popular compilers tend to cope with this fairly well. It is not unusual to see this sort of construct in "good" production code, even if the code isn't strictly language-lawyer correct. However, you are at the mercy of the compiler "doing the right thing", and it's definitely a case where you may find compiler bugs if you stress things too harshly.

There are several reasons that the standard defines this as UB, the main ones being that "different types of data may be stored in different memory" and "it can be hard for the compiler to figure out what is safe when someone is mucking about casting pointers to the same data with different types". For example, if we have a pointer to a 32-bit integer and another pointer to char, both pointing to the same address, when is it safe to read the integer value after the char value has been written? By defining it as UB, the standard leaves it entirely up to the compiler vendor to decide how precisely they want to treat these conditions. If it were "defined" that this will work, compilers might not be viable for certain processor types (or code would become horribly slow due to the liberal sprinkling of "make sure partial memory writes have completed before I read" operations, even when those are generally not needed).
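For the integer-and-char situation described above, one commonly used way to sidestep the aliasing question is to copy the bytes with std::memcpy rather than keep two differently typed pointers to the same address. This is only a sketch of that alternative, not something the question's code does; the helper names to_bytes and from_bytes are hypothetical.

```cpp
#include <cassert>   // assert, for exercising the helpers
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uint32_t) == 4, "this sketch assumes 8-bit bytes");

// Hypothetical helper: copy the object representation of a 32-bit integer
// into a byte array. Byte order follows the host's endianness.
inline void to_bytes(std::uint32_t value, unsigned char (&out)[4]) {
    std::memcpy(out, &value, sizeof value);
}

// Hypothetical helper: rebuild the integer from the same four bytes.
inline std::uint32_t from_bytes(const unsigned char (&in)[4]) {
    std::uint32_t value;
    std::memcpy(&value, in, sizeof value);
    return value;
}
```

Because the integer and the byte array are separate objects, the compiler never has to reason about a char write and an integer read hitting the same storage.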

So, in summary: It will most likely work on most processors, but don't expect any language lawyer to approve of your code.

Answered 2013-05-08T10:39:57.517