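I am trying to find the rules that C and C++ compilers follow when placing strings into the data section of an executable, but I don't know where to look. I would like to know whether the specification guarantees that the addresses of all of the following are the same in C/C++: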

char * test1 = "hello";
const char * test2 = "hello";
static char * test3 = "hello";
static const char * test4 = "hello";
extern const char * test5; // Defined in another compilation unit as "hello"
extern const char * test6; // Defined in another shared object as "hello"

I tested on Windows and they are all the same. However, I don't know whether that holds on every operating system.

4 Answers

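I would like to know whether the specification guarantees that the addresses of all of the following are the same in C/C++

String literals may be the same object, but they are not required to be.

C++ says:

(C++11, 2.14.5p12) "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined."

C says:

(C11, 6.5.2.5p7) "String literals, and compound literals with const-qualified types, need not designate distinct objects. 101) This allows implementations to share storage for string literals and constant compound literals with the same or overlapping representations."

The C99 Rationale says:

"This specification allows implementations to share copies of strings with identical text, to place string literals in read-only memory, and to perform certain optimizations"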

answered 2013-06-28T17:09:36.440

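First, this has nothing to do with the operating system. It depends solely on the implementation, i.e. the compiler.

Second, the only "guarantee" you can hope for in this case would come from the compiler documentation. The formal rules of the language guarantee neither that they are the same nor that they are different. (The latter applies to both C and C++.)

Third, some compilers have odd options such as "make string literals modifiable". This usually means that each literal is allocated in its own storage region and has a unique address.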

answered 2013-06-28T17:22:45.977

In C, I believe the only guarantee about a string literal is that it will evaluate to a pointer to a readable area of memory that will, assuming a program does not engage in Undefined Behavior, always contain the indicated characters followed by a zero byte. The compiler and linker are allowed to work together in any fashion they see fit to make that happen. While I don't know of any compiler/linker systems that do this, it would be perfectly legitimate for a compiler to put each string literal in its own constant section, and for a linker to place such sections in reverse order of length, and check before placing each one whether the appropriate sequence of bytes had already been placed somewhere. Note that the sequence of bytes wouldn't even have to be a string literal or defined constant; if the linker is trying to place the string "Hi!" and it notices that machine code contains the sequence of bytes [0x48, 0x69, 0x21, 0x00], the literal could evaluate to a pointer to the first of those.

Note that writing to the memory pointed to by a string literal is Undefined Behavior. On various systems a write may trap, do nothing, or affect only the literal written, but it could also have totally unpredictable consequences [e.g. if the literal evaluated to a pointer into some machine code].

answered 2013-06-28T17:21:20.433

They can all be the same. Even x and y below can be the same, and z can overlap y.

const char *x = "hello";
const char *y = "hello\0folks";
const char *z = "folks";
answered 2013-06-28T17:09:00.370