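Because of the way conversions and operations are defined in C, it seems to rarely matter whether you use a signed or an unsigned variable: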
uint8_t u; int8_t i;
u = -3; i = -3;
u *= 2; i *= 2;
u += 15; i += 15;
u >>= 2; i >>= 2;
printf("%u",u); // -> 2
printf("%u",i); // -> 2
So, is there a set of rules that says under which conditions the signedness of a variable really makes a difference?
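It matters in these contexts:
Division and modulo: -2/2 = -1, -2u/2 = UINT_MAX/2, -3%4 = -3, -3u%4 = 1.
Shifts: for negative signed values, the results of >> and << are implementation-defined and undefined, respectively. For unsigned values, they are always defined.
Relational operators: -2 < 0, -2u > 0.
Overflow: the compiler may assume that x+1 > x is always true if x has a signed type.
Yes. Signedness affects the result of the greater-than and less-than operators in C. Consider the following code: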
unsigned int a = -5;
unsigned int b = 7;
if (a < b)
printf("Less");
else
printf("More");
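In this example, "More" is printed even though -5 is mathematically less than 7, because -5 is converted to a very large positive number when it is stored in an unsigned int.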
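The same conversion also happens when a signed and an unsigned operand meet in one comparison. Here is a minimal sketch of that case and one possible way around it. The function names `less_naive` and `less_safe` are made up for illustration and are not part of the answer above.

```c
#include <stdbool.h>

/* Naive comparison: `a` is converted to unsigned int before comparing,
   so a negative `a` turns into a very large positive number. */
bool less_naive(int a, unsigned int b)
{
    return a < b;   /* many compilers flag this with -Wsign-compare */
}

/* Mathematically correct comparison: handle a negative `a` first,
   then compare the remaining non-negative value as unsigned. */
bool less_safe(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}
```

With these definitions, `less_naive(-5, 7)` is false while `less_safe(-5, 7)` is true.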
It also affects arithmetic on variables of different sizes. Consider this example:
unsigned char a = -5;
signed short b = 12;
printf("%d", a+b);
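The result is 263, not the expected 7. This is because -5 is stored as 251 in the unsigned char. Wraparound makes operations on variables of the same size come out right, but when a value is widened, the compiler does not sign-extend an unsigned variable, so it is treated as its original positive representation in the larger type. Study how two's complement works and you will see where this result comes from.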
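To see the zero-extension and sign-extension side by side, here is a small sketch. It assumes 8-bit chars, and the two helper names are hypothetical, chosen only for this illustration.

```c
/* Stores `value` in an unsigned char, then adds `b`.
   The unsigned char is zero-extended when it is promoted to int. */
int add_via_unsigned_char(int value, short b)
{
    unsigned char a = (unsigned char)value;  /* -5 becomes 251 with 8-bit chars */
    return a + b;
}

/* Same, but through a signed char, which is sign-extended on promotion. */
int add_via_signed_char(int value, short b)
{
    signed char a = (signed char)value;      /* -5 stays -5 */
    return a + b;
}
```

Calling both with `(-5, 12)` gives 263 for the unsigned version and 7 for the signed one.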
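It affects the range of values you can store in a variable.
It matters mainly in comparisons (here assuming u is an unsigned int and i is an int with a value below 3):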
printf("%d", (u-3) < 0); // -> 0
printf("%d", (i-3) < 0); // -> 1
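One caveat: the two lines above come out as 0 and 1 when `u` is an `unsigned int`. With a narrower type such as the question's `uint8_t`, the operand is promoted to `int` before the subtraction, so the unsigned case can go negative too. A rough sketch of the three cases, with hypothetical function names:

```c
#include <stdint.h>

/* u is promoted to int before the subtraction, so the result can be negative. */
int below_zero_u8(uint8_t u)
{
    return (u - 3) < 0;
}

/* The subtraction is done in unsigned int and wraps, so it is never negative. */
int below_zero_uint(unsigned int u)
{
    return (u - 3) < 0;   /* always 0; compilers may warn about this */
}

/* Plain signed arithmetic. */
int below_zero_int(int i)
{
    return (i - 3) < 0;
}
```

For an argument of 2, the `uint8_t` and `int` versions return 1 and the `unsigned int` version returns 0.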
Overflow on unsigned integers simply wraps around. On signed values it is undefined behavior, and anything can happen.
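In practice this means unsigned arithmetic can be left to wrap, while signed arithmetic has to be range-checked before the operation if overflow is possible. A minimal sketch of both, where `wrapping_add` and `checked_add` are names invented for this example:

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned addition is defined to wrap modulo UINT_MAX + 1. */
unsigned int wrapping_add(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Signed addition that reports overflow instead of invoking undefined
   behavior: the range check happens before the addition is performed. */
bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;           /* a + b would not fit in an int */
    *result = a + b;
    return true;
}
```

`wrapping_add(UINT_MAX, 1u)` yields 0, whereas `checked_add(INT_MAX, 1, &r)` returns false and leaves `r` untouched.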
The signedness of a 2's complement number is just a matter of how you interpret the bits. Imagine the 3-bit numbers:
000
001
010
011
100
101
110
111
If you think of 000 as zero and read the numbers the way that is natural to humans, you would interpret them like this:
000: 0
001: 1
010: 2
011: 3
100: 4
101: 5
110: 6
111: 7
This is called "unsigned integer". You see everything as a number bigger than/equal to zero.
Now, what if you want to have some numbers as negative? Well, 2's complement comes to the rescue. 2's complement is known to most people as just a formula, but in truth it's just congruence modulo 2^n, where n is the number of bits in your number.
Let me give you a few examples of congruence:
2 = 5 = 8 = -1 = -4 modulo 3
-2 = 6 = 14 modulo 8
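If you want to check these congruences in C, keep in mind that the `%` operator truncates toward zero, so it returns a negative remainder for negative operands. A small sketch with a hypothetical helper `mod_floor` that always returns the representative in 0..m-1:

```c
/* Returns the representative of `a` modulo `m` in the range 0..m-1.
   C's % operator truncates toward zero, so a negative `a` needs the
   extra "+ m" step. Assumes m > 0 and m small enough that the
   addition cannot overflow. */
int mod_floor(int a, int m)
{
    return ((a % m) + m) % m;
}
```

All of 2, 5, 8, -1 and -4 map to 2 with `m = 3`, and -2, 6 and 14 all map to 6 with `m = 8`.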
Now, just for convenience, let's say you decide to use the leftmost bit of a number as its sign. So you want to have:
000: 0
001: positive
010: positive
011: positive
100: negative
101: negative
110: negative
111: negative
Viewing your numbers congruent modulo 2^3 (= 8), you know that:
4 = -4
5 = -3
6 = -2
7 = -1
Therefore, you view your numbers as:
000: 0
001: 1
010: 2
011: 3
100: -4
101: -3
110: -2
111: -1
As you can see, the actual bits for -3 and 5 (for example) are the same (if the number has 3 bits). Therefore, writing x = -3 or x = 5 gives you the same result.
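The two readings of the same bits can be written down directly. This is only a sketch: `as_unsigned` and `as_signed` are made-up names, and the code assumes a 32-bit `int` with 0 < n < 31.

```c
/* Reads the low `n` bits of `bits` as an unsigned number. */
unsigned int as_unsigned(unsigned int bits, unsigned int n)
{
    return bits & ((1u << n) - 1u);
}

/* Reads the same bits as a 2's complement number: if the top bit is
   set, the value is the unsigned reading minus 2^n. */
int as_signed(unsigned int bits, unsigned int n)
{
    unsigned int value = as_unsigned(bits, n);
    if (value & (1u << (n - 1u)))
        return (int)value - (int)(1u << n);
    return (int)value;
}
```

For the 3-bit pattern 101, `as_unsigned(5, 3)` is 5 and `as_signed(5, 3)` is -3.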
Interpreting numbers congruent modulo 2^n has other benefits. If you sum 2 numbers, one negative and one positive, it could happen on paper that you have a carry that would be thrown away, yet the result is still correct. Why? That carry was a 2^n which is congruent to 0 modulo 2^n! Isn't that convenient?
Overflow is another case of congruence. In our 3-bit example, if you sum the two unsigned numbers 5 and 6, you get 3, which is congruent to the true sum 11 modulo 8.
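Both the discarded carry and the overflow case boil down to keeping only the low n bits of the sum. A minimal sketch, assuming a 32-bit `unsigned int` and 0 < n < 32 (the name `add_nbits` is hypothetical):

```c
/* Adds two n-bit numbers and keeps only the low n bits, which is the
   same as discarding the carry out of the top bit. */
unsigned int add_nbits(unsigned int a, unsigned int b, unsigned int n)
{
    unsigned int mask = (1u << n) - 1u;
    return (a + b) & mask;
}
```

With n = 3, `add_nbits(5, 6, 3)` gives 3 (11 modulo 8), and `add_nbits(5, 3, 3)` gives 0, which is -3 + 3 with the carry thrown away.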
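So, why do you use signed and unsigned? For the CPU there is actually very little difference. For you, however, the interpretation differs: with n bits, a signed number covers -2^(n-1) to 2^(n-1)-1, while an unsigned number covers 0 to 2^n-1.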
So, for example, if you assign -1 to an unsigned number, it's the same as assigning 2^n-1 to it.
As per your example, that's exactly what you are doing. You are assigning -3 to a uint8_t, which is well-defined in C: the value is converted modulo 2^8, so as far as the CPU is concerned you are assigning 253 to it. Then all the rest of the operations are the same for both types and you end up getting the same result.
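Here is the question's sequence traced step by step, as a sketch. The two functions and their output arrays are only there to make the intermediate values visible; they are not from the question.

```c
#include <stdint.h>

/* Runs the question's operations on a uint8_t and records every
   intermediate value. */
void trace_unsigned(uint8_t out[4])
{
    uint8_t u = -3;  out[0] = u;   /* 253: -3 converted modulo 256 */
    u *= 2;          out[1] = u;   /* 250: 506 modulo 256 */
    u += 15;         out[2] = u;   /* 9:   265 modulo 256 */
    u >>= 2;         out[3] = u;   /* 2 */
}

/* The same operations on an int8_t. */
void trace_signed(int8_t out[4])
{
    int8_t i = -3;   out[0] = i;   /* -3 */
    i *= 2;          out[1] = i;   /* -6 */
    i += 15;         out[2] = i;   /* 9 */
    i >>= 2;         out[3] = i;   /* 2 */
}
```

The intermediate values differ (253 and 250 versus -3 and -6), but each pair is congruent modulo 256, and both traces end in 9 and then 2.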
There is, however, a point that your example misses. The >> operator on a negative signed number typically extends the sign when shifting (the C standard leaves this implementation-defined). Since both of your values are 9 before the shift, you don't notice this. If you didn't have the +15, you would have -6 in i and 250 in u, which after >> 2 would give -2 in i (bit pattern 11111110, which reads as 254 when interpreted as an unsigned 8-bit value) and 62 in u. (See Peter Cordes' comment below for a few technicalities.)
To understand this better, take this example:
(signed)101011 (-21) >> 3 ----> 111101 (-3)
(unsigned)101011 ( 43) >> 3 ----> 000101 ( 5)
If you notice, floor(-21/8) is actually -3 and floor(43/8) is 5. However, -3 and 5 are not equal (and are not congruent modulo 64, since there are 6 bits).
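The floor is worth stressing because C's `/` truncates toward zero instead, so a sign-extending shift and a division by a power of two disagree for negative values. Below is a sketch of a shift that floors without relying on implementation-defined behavior. It assumes a 2's complement `int` and a shift count smaller than 31, and both function names are made up.

```c
/* Arithmetic right shift without implementation-defined behavior:
   for negative x, ~x is non-negative, so shifting it is well defined,
   and complementing again restores the sign. The result is
   floor(x / 2^n). */
int shift_right_floor(int x, unsigned int n)
{
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Integer division in C truncates toward zero instead. */
int divide_pow2(int x, unsigned int n)
{
    return x / (1 << n);
}
```

`shift_right_floor(-21, 3)` gives -3, while `divide_pow2(-21, 3)` gives -2. For the non-negative value 43, both give 5.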