c - Using pointers with strings

Question

I'm really confused about using pointers with strings. They seem to obey different rules. Consider the following code:

char *ptr = "apple";// perfectly valid here not when declaring afterwards like next

ptr = "apple"; // shouldn't it be *ptr = "apple"

Also, printf() behaves differently:

printf("%s", ptr) // Why should I send  the address instead of the value

I also came across the following code in a book:

char str[]="Quest";
char *p="Quest";

str++; // error, constant pointer can't change

*str='Z'; // works, because pointer is not constant

p++; // works, because pointer is not constant

*p = 'M'; // error, because string is constant

I can't understand what this is supposed to imply.

Please help; I can't find any information about this anywhere else.

score 5 · Accepted Answer

char *ptr;
ptr = "apple"; // shouldn't it be *ptr =      "apple"

No, because *ptr would be a char. So you could write *ptr = 'a', but you can't write what you suggest.

printf("%s", ptr) // Why should I send  the address instead of the value

Because a string in C is handled through the address of a sequence of characters (char) that ends with a zero (the null character, aka \x0).
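To make the "address of a zero-terminated sequence" idea concrete, here is a minimal sketch of the kind of scan a %s-style consumer performs. The name count_chars is hypothetical, chosen for illustration; it does roughly what the standard strlen does.

```c
#include <stddef.h>

/* Hypothetical helper: starts at the address s and walks forward until
   it reaches the terminating '\0', returning how many characters it
   passed. All it needs is the address of the first character. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Calling count_chars("apple") passes only the address of 'a'; the function finds the end by itself.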

char str[] = "Quest";
char *p = "Quest";

str++; // error, constant pointer can't change

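No, a pointer can be changed perfectly well, but here str is an array (which is slightly different from a pointer). An array is not a modifiable lvalue, so it cannot be incremented or assigned to.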
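A small sketch of the usual workaround, assuming you want to step through the array: copy the array's address into a real pointer variable and increment that instead. The function name second_char_of_array is made up for this example.

```c
/* Returns the second character of a local array by stepping a pointer,
   since the array name itself cannot be incremented. */
char second_char_of_array(void)
{
    char str[] = "Quest";
    /* str++;  would not compile: an array is not a modifiable lvalue */
    char *q = str;   /* str converts to a pointer to its first element */
    q++;             /* fine: q is an ordinary pointer variable */
    return *q;       /* the character stored in str[1] */
}
```

Note that an expression such as *(str + 1) is also fine, because it only computes a new pointer value and never tries to modify str.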

*str='Z'; // works, because pointer is not constant

No, it works because *str is a char.

p++; // works, because pointer is not constant

No, it works because this time it is a pointer (not an array).

*p = 'M'; // error, because string is constant

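As above, this is again a char, so the assignment has the correct type and compiles; whether the string is "constant" does not enter into it. But, as Michael Walz said in the comments, even though it compiles, it invokes undefined behavior at run time (most likely a crash with a segfault): the standard does not say whether the string that p points to is stored read-only, though most modern implementations seem to make it so.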
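If the goal is a modified version of the text, one common approach is to copy the literal into writable storage first. This is only a sketch under that assumption; copy_and_set_first and its parameters are hypothetical names, not anything from the question.

```c
#include <string.h>

/* Copies src into the caller's writable buffer dst (capacity cap bytes),
   then overwrites the first character with c.
   Returns 0 on success, -1 if src is empty or does not fit. */
int copy_and_set_first(char *dst, size_t cap, const char *src, char c)
{
    size_t len = strlen(src);
    if (len == 0 || len + 1 > cap) {
        return -1;
    }
    memcpy(dst, src, len + 1);  /* include the terminating '\0' */
    dst[0] = c;                 /* safe: dst is writable storage */
    return 0;
}
```

With a local char buf[8], calling copy_and_set_first(buf, sizeof buf, "Quest", 'M') leaves the literal untouched and changes only the copy.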

For more information, see this SO question.

score 3 · Accepted Answer

When used with a pointer, "*" means "get what the pointer points to". In the following case:

    char* ptr;

ptr is a pointer to a character, and you can point it at a string like this:

   const char* ptr = "test";

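The layout in memory is 't', followed by 'e', 's', 't', and finally a NUL terminator '\0'.

When you assign it as above, the pointer is set to the first memory location, which happens to hold 't'.

*ptr returns what ptr points to, and its size is always that of the declared pointee type, in this case char, a single byte.

*(++ptr) returns 'e', because ptr is incremented to the next location before it returns what it now points to.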
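The dereference and pre-increment steps can be checked with a short sketch. The function deref_demo and its two output parameters are invented for the example.

```c
/* Stores the character ptr points to before and after a pre-increment. */
void deref_demo(char *before, char *after)
{
    const char *ptr = "test";
    *before = *ptr;      /* the first character, 't' */
    *after = *(++ptr);   /* ptr moves to the next char first, so 'e' */
}
```

The pointer is declared const char * here, which still allows ++ptr: the const applies to the characters, not to the pointer.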

score 3 · Accepted Answer

1- I think you are mixing up declaration (with initialization) and assignment. This line:

char *ptr = "apple";

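declares a pointer to char and initializes the variable ptr with the address of the first character, 'a'. This line is equivalent to the following two: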

char* ptr;
ptr = "apple";

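Now, string literals in C should be treated as read-only. Their type is not const-qualified, but modifying one is undefined behavior, so it is safest to act as if the pointer had been declared as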

const char* ptr;

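So in effect, you can't change the contents of the location this pointer points to. And even if you could, the way you're doing it is wrong. Because ptr points to the location of the string's first character, writing *ptr accesses the contents of that first character. So it expects a character, not a string. It would be something like *ptr = 'a'.

2- Well, that's how printf works. If you want to print a string with the %s specifier, it expects a pointer to that string, i.e. the address of the string's first character, not the value of the string itself.

3- Now I'll comment on your code.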

str++; // error, constant pointer can't change

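You're right that str can't be changed. Others keep saying that arrays and pointers are slightly different, and at the language level that is true: sizeof and & treat an array as a whole object. At the assembly level, though, an array name is just the address of a block of values. You can think of an array as behaving like an immutable pointer with mutable contents: the array name evaluates to the address of the first element of the sequence. You can change the contents of the array, but not that address.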

*str='Z'; // works, because pointer is not constant

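Now you're mixing things up a bit. The array name is in fact constant; that is, you can't change the address it denotes. But you can change what is stored at that address, which is what the line above does. It changes the first value in the array's sequence of values.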

p++; // works, because pointer is not constant

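Correct. The pointer is not constant, although what it points to is. You can change the address the pointer stores, but not the value it points to. Here p is a mutable pointer to an immutable string.

*p = 'M'; // error, because string is constant

Correct, the string is immutable.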
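The two kinds of constness discussed in this answer can be spelled out with const, which lets the compiler enforce them. This is an illustrative sketch; const_demo, movable and fixed are hypothetical names.

```c
/* Demonstrates the two independent kinds of constness.
   Writes through the fixed pointer into the caller's buffer and returns
   the character that movable points to after it has been advanced. */
char const_demo(char *buf)
{
    const char *movable = "Quest"; /* pointee is const, pointer is not */
    char *const fixed = buf;       /* pointer is const, pointee is not */

    movable++;          /* OK: the pointer itself may change */
    /* *movable = 'M';     rejected by the compiler: pointee is const */

    fixed[0] = 'Z';     /* OK: the pointee may change */
    /* fixed++;            rejected by the compiler: pointer is const */

    return *movable;
}
```

Declaring p as const char * in the book's example would turn the run-time problem with *p = 'M' into a compile-time error.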

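score 2 · Accepted Answer

"SOME STRING" creates a sequence of characters in memory terminated by \0 and yields the address of its first character, so you can assign it to a pointer:
char *ptr = "Hello";
The printf function also works with addresses; the format specifier defines how it should read the data from memory.
char str[]="Quest"; char *p="Quest";
In the first, you are creating an array with 6 slots and storing 'Q', 'u', 'e', 's', 't', '\0' in it. You can then change the value at any index, e.g. str[2] = 'x', but the array name itself is a constant that denotes the address of the first element, so you can't change it with something like str++;
In the second, "Quest\0" is a constant string stored somewhere in memory, and the address of its first character is stored in p. So you can't change the string, but the pointer itself is not const and you can do p++;.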

score 2 · Accepted Answer

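I'm only going to answer sub-question 1. But you've hit on a common yet subtle confusion in C, namely a slight mismatch between the way you initialize a pointer and the way you assign to it. Watch closely.

If I have an int variable, I can initialize it when I declare it: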

int i = 42;

Or, I can declare it on one line (without initializing it), and then give it a value:

int i;
i = 42;

No mystery there. But when it comes to pointers, it looks a little different. Again, I can declare and initialize on one line:

char *ptr = "apple";

Or I can split the declaration and the assignment:

char *ptr;
ptr = "apple";

But at first this looks odd: based on the first syntax, shouldn't the second way be this?

*ptr = "apple";         // WRONG

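No, it should not, and here's why.

ptr is a pointer to some characters. This is one way of referring to a string in C.

* is the pointer indirection operator. In an expression, *ptr refers to the character (just one character) that ptr points to. So if we want to get at the first character of the string, we can use * to do so: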

printf("first character: %c\n", *ptr);

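Note that the format in this printf call uses %c, because it prints just one character.

We can also assign pointers. If we're working with pointers to char, and are therefore thinking of those pointers as "strings", this is one way of doing string assignment in C. If I say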

ptr = "apple";

then no matter where ptr pointed before, it now points to an array of characters containing the string "apple". If I later say

ptr = "pear";

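then ptr no longer points to the string "apple"; now it points to a different array of characters containing the string "pear". You can think of this pointer assignment as assigning all the characters of the string at once (although that is not what it actually does at all).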
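A quick way to see that pointer assignment copies no characters is to keep a second pointer to the original string. This is a minimal sketch; reassign_demo and saved are names invented for the illustration.

```c
#include <string.h>

/* Returns 1 if reassigning one pointer leaves another pointer to the
   original literal untouched, 0 otherwise. */
int reassign_demo(void)
{
    const char *ptr = "apple";
    const char *saved = ptr;   /* second pointer to the same characters */

    ptr = "pear";              /* only ptr changes; nothing is copied */

    return strcmp(saved, "apple") == 0 && strcmp(ptr, "pear") == 0;
}
```

After the assignment, saved still sees "apple", because the characters of "apple" were never overwritten; ptr simply points somewhere else.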

So if *ptr accesses just one character, and ptr is the pointer value itself, then why does the first form

char *ptr = "apple";

work?

The answer is that when you say

char *ptr = "apple";

the * there is not the pointer indirection operator. It does not say that we're trying to access the first character of anything.

When you say

char *ptr = "apple";

the * says that ptr is a pointer. It's just like when you say

char *ptr;

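where the * also says that ptr is a pointer.

C's declaration syntax for pointers is a little odd. Here's how to think about it. The syntax is

type-name thing-that-has-that-type ;

So when we say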

char *ptr;

the type-name is char, and the thing-that-has-that-type is *ptr. We're saying that *ptr will be a char. And if *ptr is going to be a char, that means that ptr must be a pointer-to-char.

Then, when we say

char *ptr = "apple";

we're saying that ptr (which we just said is a pointer-to-char) should have, as its initial value, a pointer to an array containing the string "apple".
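One consequence of the "type-name, then thing-that-has-that-type" reading is that the * attaches to a single declarator, not to the type name. The following sketch shows this; the function name size_of_second_declarator is made up for the example.

```c
#include <stddef.h>

/* In "char *a, b;" the declaration says that *a is a char and that b is
   a char. So a is a pointer to char, while b is a plain char. */
size_t size_of_second_declarator(void)
{
    char *a = NULL, b = 'x';
    (void)a;           /* only declared to show the syntax */
    (void)b;
    return sizeof b;   /* size of a char, not of a pointer */
}
```

Since sizeof(char) is 1 by definition, the result confirms that b did not become a pointer just because it shares a line with a.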

score 2 · Accepted Answer

ptr = "apple"; // shouldn't it be *ptr = "apple"

Starting from the beginning...

The string literal "apple" is stored in a 6-element array of char, like so:

+---+---+---+---+---+---+
|'a'|'p'|'p'|'l'|'e'| 0 |
+---+---+---+---+---+---+

The trailing 0 marks the end of the string (it's called the string terminator).

When an expression of type "N-element array of T" appears in an expression, it will be converted ("decay") to an expression of type "pointer to T" and the value of the expression will be the address of the first element of the array, unless the array expression is the operand of the sizeof or unary & operators, or is used to initialize a character array in a declaration.

Thus, in the statement

ptr = "apple";

the expression "apple" is converted ("decays") from an expression of type "6-element array of char" to "pointer to char". The type of the expression ptr is char *, or "pointer to char"; thus, in the assignment above, ptr will receive the address of the first element of "apple".

It should not be written as

*ptr = "apple";

since the expression *ptr evaluates to the value of the thing ptr points to, which at this point is a) indeterminate, and b) the wrong type for the assignment. The type of the expression *ptr is char, which is not compatible with char *.

I've written a utility that prints a map of items in memory; given the code

char *ptr = "apple";
char arr[] = "apple";

the map looks something like this:

       Item         Address   00   01   02   03
       ----         -------   --   --   --   --
      apple        0x400c80   61   70   70   6c    appl
                   0x400c84   65   00   70   74    e.pt

        ptr  0x7fffcb4d4518   80   0c   40   00    ..@.
             0x7fffcb4d451c   00   00   00   00    ....

        arr  0x7fffcb4d4510   61   70   70   6c    appl
             0x7fffcb4d4514   65   00   00   00    e...

The string literal "apple" lives at address 0x400c80¹. The variables ptr and arr live at addresses 0x7fffcb4d4518 and 0x7fffcb4d4510, respectively².

The variable ptr contains the value 0x400c80, which is the address of the first element of the "apple" string literal (x86 stores multi-byte values in "little-endian" order, so the least-significant byte comes first, meaning you have to read right-to-left).

Remember the "except" clause above? In the second declaration, the string literal "apple" is being used to initialize an array of char in a declaration; instead of being converted to a pointer value, the contents of the string literal are copied to the array, which you can see in the memory dump.
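The sizeof exception mentioned earlier gives an easy way to observe the difference between the two declarations. This is a small sketch; array_size and pointer_size are hypothetical helper names.

```c
#include <stddef.h>

/* Size of an array initialized from "apple": the characters are copied
   into the array, so sizeof sees all 6 bytes (5 letters plus the 0). */
size_t array_size(void)
{
    char arr[] = "apple";
    return sizeof arr;
}

/* Size of a pointer initialized from "apple": the literal decays to a
   pointer, so sizeof sees only the pointer object itself. */
size_t pointer_size(void)
{
    const char *ptr = "apple";
    return sizeof ptr;
}
```

On the 64-bit platform in the memory map above, pointer_size() would be expected to return 8, which matches the two 4-byte rows shown for ptr.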

printf("%s", ptr) // Why should I send the address instead of the value

Because that's what the %s conversion specifier expects - it takes a pointer to the first character of a 0-terminated string, and will print out the sequence of characters starting at that location until it sees the terminator.
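Because %s only needs a starting address, you can hand it a pointer into the middle of a string. The sketch below uses snprintf so the result can be inspected; format_from and its parameters are names I made up for the example.

```c
#include <stdio.h>

/* Formats the string that starts skip characters into s, using %s.
   The caller must ensure skip does not exceed the length of s.
   Returns what snprintf returns: the number of characters that the
   full result needs, not counting the terminator. */
int format_from(char *out, size_t cap, const char *s, size_t skip)
{
    return snprintf(out, cap, "%s", s + skip);
}
```

With s pointing at "apple" and skip equal to 2, the conversion starts at the second 'p' and stops at the terminator, so the output buffer receives "ple".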

3 ... I can't understand what is supposed to imply

You cannot change the value of an array object. Let's look at what str would look like in memory:

     +---+
str: |'Q'| str[0]
     +---+
     |'u'| str[1]
     +---+
     |'e'| str[2]
     +---+
     |'s'| str[3]
     +---+
     |'t'| str[4]
     +---+
     | 0 | str[5]
     +---+

You can write to each str[i]³ (changing its value), but you cannot write to str because there's nothing to write to. There's no str object separate from the array elements. Even though the expression str will "decay" to a pointer value, no storage is set aside anywhere for that pointer - the conversion is done at compile time.
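One way to check that there is no separate str object is to compare addresses. This is a sketch under the assumption that comparing the pointers after conversion to void * is an acceptable test; same_address is a hypothetical name.

```c
/* Returns 1 if the array, its first element, and the decayed pointer
   value all share one address, i.e. there is no separate pointer object
   stored inside or next to the array. */
int same_address(void)
{
    char str[] = "Quest";
    char *decayed = str;   /* value computed from str, not stored in str */

    return (void *)&str == (void *)&str[0]
        && (void *)&str[0] == (void *)decayed;
}
```

The expressions &str and &str[0] have different types (pointer to a 6-element array versus pointer to char) but refer to the same location.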

Similarly, attempting to modify the contents of a string literal invokes undefined behavior⁴; you may get a segfault, or your code may work as expected, or you may wind up launching nukes at Liechtenstein. So you can't write to *p or p[i]; however, you can write a new value to p, pointing it to a different location.

¹ Technically, it's 0x0000000000400c80; the %p specifier drops leading zeros.
² Same deal: technically, the values are 0x00007fffcb4d4518 and 0x00007fffcb4d4510. Note that the specific address values will change from run to run.
³ *str is equivalent to str[0].
⁴ The C language definition identifies certain operations which are erroneous, but doesn't place any requirements on the compiler to handle that code in any particular way. Different platforms store string literals in different ways; some put them in read-only memory, so attempting to modify them results in a segfault, while other platforms store them in a writable segment so that the operation succeeds. Some may store them in such a way that you don't get a segfault, but the string isn't changed.
