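I found that the C99 standard has a statement denying compatibility between the type char and the types signed char/unsigned char.
Footnote 35 of the C99 standard:
CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
My question is: why does the committee deny compatibility? What is the rationale? Would something terrible happen if char were compatible with signed char or unsigned char?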
The roots are in compiler history. There were (and still are) essentially two C dialects in the Eighties:
- one where plain char is signed
- one where plain char is unsigned
Which of these should C89 have standardized? C89 chose to standardize neither, because picking either one would have invalidated a large number of assumptions made in C code already written, what standards folks call the installed base. So C89 did what K&R did: leave the signedness of plain char implementation-defined. If you require a specific signedness, qualify your char.
Modern compilers usually let you choose the dialect with an option (e.g. gcc's -funsigned-char).
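To make the two dialects concrete, here is a minimal sketch of how code can find out which one it was compiled under, using the CHAR_MIN test mentioned in the footnote. The helper name plain_char_is_signed is made up for this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports which "dialect" this translation unit
   was compiled under. CHAR_MIN is either 0 (plain char is unsigned)
   or SCHAR_MIN (plain char is signed), so a sign test is enough. */
static bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Building the same file with and without gcc's -funsigned-char should flip the result, while char remains a distinct type in both cases.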
The "terrible" thing that can happen if you ignore the distinction between (un)signed char and plain char is that if you do arithmetic and shifts without taking these details into account, you might get sign extensions when you don't expect them or vice versa (or even undefined behavior when shifting).
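A small sketch of the sign-extension surprise, assuming 8-bit bytes and a two's complement implementation where out-of-range conversions to a signed char wrap (true for the usual gcc and clang targets, but implementation-defined in general). The function names are made up for this example.

```c
#include <limits.h>

/* Widening a plain char directly: the result depends on the
   implementation's choice of signedness. A byte such as 0xE9 may
   come out as 233 or as -23. */
static int widen_plain(char c)
{
    return c;
}

/* Converting to unsigned char first gives a value in 0..UCHAR_MAX
   regardless of whether plain char is signed. */
static int widen_as_byte(char c)
{
    return (unsigned char)c;
}
```

The second form is the one to use before indexing a lookup table or shifting, since a negative index or a shifted negative value is where the trouble starts.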
There's also some dumb advice out there that recommends to always declare your chars with an explicit signed or unsigned qualifier. This works as long as you only work with pointers to such qualified types, but it requires ugly casts as soon as you deal with strings and string functions, all of which operate on pointer-to-plain-char, which is assignment-incompatible without a cast. Such code suddenly gets plastered with tons of ugly-to-the-bone casts.
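For illustration, this is roughly what that cast noise looks like. The wrapper byte_string_length is a made-up name; strlen is the standard function from <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around strlen for text stored as unsigned char.
   strlen takes const char *, and const unsigned char * does not convert
   to it implicitly, so every call site needs a cast like this one. */
static size_t byte_string_length(const unsigned char *s)
{
    return strlen((const char *)s);
}
```

With plain char for the string, the cast and the wrapper both disappear.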
The basic rules for chars are:
- Use plain char for strings and if you need to pass pointers to functions taking plain char.
- Use unsigned char if you need to do bit twiddling and shifting on bytes.
- Use signed char if you need small signed values, but think about using int if space is not a concern.

Think of signed char and unsigned char as the smallest arithmetic integer types, just like signed short / unsigned short, and so on with int, long int, long long int. Those types are all well-specified.
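On the other hand, char serves a very different purpose: it is the basic type for I/O and for communication with the system. It is not meant for computation, but rather as the unit of data. That is why you find char used in command-line arguments, in the definition of "strings", in the FILE* functions and other read/write-style I/O functions, as well as in the exception to the strict aliasing rule. The char type is deliberately defined less strictly, so that every implementation can use the most "natural" representation.
It is simply a matter of separating responsibilities.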
(It is true, though, that char is layout-compatible with both signed char and unsigned char, so you can explicitly convert one to the other and back.)
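As a sketch of "three distinct types, one shared layout": the example below uses _Generic, which is a C11 feature and not available in C99, so treat it purely as an illustration. CHAR_KIND, the enum, and round_trip are names invented here. The selection would not even compile if char were compatible with signed char or unsigned char, because _Generic rejects two compatible types in its association list.

```c
/* Requires C11 for _Generic. */
enum char_kind { KIND_PLAIN, KIND_SIGNED, KIND_UNSIGNED, KIND_OTHER };

/* Picks a branch by the exact type of x; char, signed char and
   unsigned char each get their own branch because they are distinct. */
#define CHAR_KIND(x) _Generic((x),      \
    char:          KIND_PLAIN,          \
    signed char:   KIND_SIGNED,         \
    unsigned char: KIND_UNSIGNED,       \
    default:       KIND_OTHER)

/* Explicit conversion to unsigned char and back. On the usual two's
   complement implementations this returns the original value; the
   conversion back is implementation-defined when char is signed and
   the value does not fit. */
static char round_trip(char c)
{
    return (char)(unsigned char)c;
}
```

Note that a character constant such as 'a' has type int in C, so CHAR_KIND('a') would select the default branch; the macro is meant to be applied to objects or cast expressions of character type.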