NOTE: See my edits below.


Came across some curious behaviour which I cannot reconcile:

#if -5 < 0
#warning Good, -5 is less than 0.
#else
#error BAD, -5 is NOT less than 0.
#endif

#if -(5u) < 0
#warning Good, -(5u) is less than 0.
#else
#error BAD, -(5u) is less than 0.
#endif

#if -5 < 0u
#warning Good, -5 is less than 0u.
#else
#error BAD, -5 is less than 0u.
#endif

When compiled:

$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5 is less than 0.
pp_test.c:10:6: error: #error BAD, -(5u) is less than 0.
pp_test.c:13:9: warning: the left operand of "<" changes sign when promoted
pp_test.c:16:6: error: #error BAD, -5 is less than 0u.

This suggests that the preprocessor follows different type promotion rules when evaluating constant integer expressions. Namely that, when an operator has operands of mixed sign, the signed operand is changed to an unsigned operand. The opposite is (generally) true in C.

I can find nothing in the literature to support this, but it's possible (likely?) that I haven't been thorough enough. Have I missed something? Is this behaviour correct?

As it stands, it seems as though any conditional expression in an #if or #elif directive which involves an explicitly unsigned integer constant may fail to behave as expected, i.e. as it would in C.

EDIT: As per my comments in Sourav Ghosh's answer, my confusion originally stemmed from expressions which included constants specified with L and LL suffixes. The example code I included in my original question was too simplified. Here is a better example:

#if -5LL < 0L
#warning Good, -5LL is less than 0L.
#else
#error BAD, -5LL is NOT less than 0L.
#endif

#if -(5uLL) < 0L
#warning Good, -(5uLL) is less than 0L.
#else
#error BAD, -(5uLL) is less than 0L.
#endif

#if -5LL < 0uL
#warning Good, -5LL is less than 0uL.
#else
#error BAD, -5LL is less than 0uL.
#endif


$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5LL is less than 0L.
pp_test.c:10:6: error: #error BAD, -(5uLL) is less than 0L.
pp_test.c:13:9: warning: the left operand of "<" changes sign when promoted
pp_test.c:16:6: error: #error BAD, -5LL is less than 0uL.

This seems to violate the clause subsequent to the one posted by Sourav Ghosh (my emphasis):

Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

It seems to violate this clause because -5LL has a rank which is higher than 0uL, and because the type of the first (signed long long) can indeed represent all of the values of the type of the second (unsigned long), at least on a target where long is narrower than long long. The catch is, the preprocessor doesn't know this.
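To make the compiler's side of this concrete, here is a minimal sketch of the same comparison done in ordinary C. The helper names are made up for illustration, and the second helper assumes the integer types have no padding bits, so that a wider long long can hold every unsigned long value.

```c
/* Hypothetical helper: compares a signed long long with an unsigned long
 * the way the compiler does, via the usual arithmetic conversions. */
int less_ll_ul(long long a, unsigned long b)
{
    return a < b;
}

/* Returns 1 when long long is wider than unsigned long. Assuming no padding
 * bits, that is the case in which long long can represent every unsigned
 * long value, so the comparison above is carried out as a signed one. */
int ll_holds_all_ul(void)
{
    return sizeof(long long) > sizeof(unsigned long);
}
```

On a target with a 32-bit long, less_ll_ul(-5LL, 0uL) should give 1. Where long and long long are the same width, both operands end up as unsigned long long and the result should be 0. In both cases it should agree with ll_holds_all_ul().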

As mentioned in https://gcc.gnu.org/onlinedocs/gcc-3.0.2/cpp_4.html (my emphasis):

The preprocessor calculates the value of expression. It carries out all calculations in the widest integer type known to the compiler; on most machines supported by GCC this is 64 bits. This is not the same rule as the compiler uses to calculate the value of a constant expression, and may give different results in some cases. If the value comes out to be nonzero, the `#if' succeeds and the controlled text is included; otherwise it is skipped.

What seems to be implied by "carries out all calculations in the widest integer type known to the compiler" is that the operands themselves are treated as though they are specified as that same 'widest' type. In other words, -5 and -5L are treated as though they are -5LL, and 0u and 0uL are treated as though they are 0uLL. This activates the clause quoted by Sourav Ghosh, and leads to the observed behaviour.

In effect, there is only one rank as far as the preprocessor is concerned, so type promotion rules which depend upon operands with different rank are ignored. Is this not indeed different from how the compiler evaluates expressions?

EDIT #2: Here's a real-world example of how the same expression is evaluated differently by the preprocessor than it is by the compiler (taken from Optiboot).

#ifndef BAUD_RATE
#if F_CPU >= 8000000L
#define BAUD_RATE   115200L
#elif F_CPU >= 1000000L
#define BAUD_RATE   9600L
#elif F_CPU >= 128000L
#define BAUD_RATE   4800L
#else
#define BAUD_RATE 1200L
#endif
#endif

#ifndef UART
#define UART 0
#endif

#define BAUD_SETTING (( (F_CPU + BAUD_RATE * 4L) / ((BAUD_RATE * 8L))) - 1 )
#define BAUD_ACTUAL (F_CPU/(8 * ((BAUD_SETTING)+1)))
#define BAUD_ERROR (( 100*(BAUD_ACTUAL - BAUD_RATE) ) / BAUD_RATE)

#if BAUD_ERROR >= 5
#error BAUD_RATE error greater than 5%
#elif (BAUD_ERROR + 5) <= 0
#error BAUD_RATE error greater than -5%
#elif BAUD_ERROR >= 2
#warning BAUD_RATE error greater than 2%
#elif (BAUD_ERROR + 2) <= 0
#warning BAUD_RATE error greater than -2%
#endif

volatile long long int baud_setting = BAUD_SETTING;
volatile long long int baud_actual = BAUD_ACTUAL;
volatile long long int baud_error = BAUD_ERROR;

void foo(void) {
  baud_setting = BAUD_SETTING;
  baud_actual = BAUD_ACTUAL;
  baud_error = BAUD_ERROR;
}

Building for an AVR target:

$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000L optiboot_pp_test.c

Note how F_CPU was specified as a signed constant.

optiboot_pp_test.c:28:6: warning: #warning BAUD_RATE error greater than -2% [-Wcpp]
     #warning BAUD_RATE error greater than -2%

This works as expected. Examining the object file:

      baud_setting = BAUD_SETTING;
   8:   88 e0           ldi     r24, 0x08       ; 8
   a:   90 e0           ldi     r25, 0x00       ; 0
   c:   a0 e0           ldi     r26, 0x00       ; 0
   e:   b0 e0           ldi     r27, 0x00       ; 0
  10:   80 93 00 00     sts     0x0000, r24
  14:   90 93 00 00     sts     0x0000, r25
  18:   a0 93 00 00     sts     0x0000, r26
  1c:   b0 93 00 00     sts     0x0000, r27
      baud_actual = BAUD_ACTUAL;
  20:   87 e0           ldi     r24, 0x07       ; 7
  22:   92 eb           ldi     r25, 0xB2       ; 178
  24:   a1 e0           ldi     r26, 0x01       ; 1
  26:   b0 e0           ldi     r27, 0x00       ; 0
  28:   80 93 00 00     sts     0x0000, r24
  2c:   90 93 00 00     sts     0x0000, r25
  30:   a0 93 00 00     sts     0x0000, r26
  34:   b0 93 00 00     sts     0x0000, r27
      baud_error = BAUD_ERROR;
  38:   8d ef           ldi     r24, 0xFD       ; 253
  3a:   9f ef           ldi     r25, 0xFF       ; 255
  3c:   af ef           ldi     r26, 0xFF       ; 255
  3e:   bf ef           ldi     r27, 0xFF       ; 255
  40:   80 93 00 00     sts     0x0000, r24
  44:   90 93 00 00     sts     0x0000, r25
  48:   a0 93 00 00     sts     0x0000, r26
  4c:   b0 93 00 00     sts     0x0000, r27

... shows that the expected values are assigned. Namely, baud_setting gets 8, baud_actual gets 111111, and baud_error gets -3.

Now we build with F_CPU defined as an unsigned constant (as is customary on this target):

$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000UL optiboot_pp_test.c 
optiboot_pp_test.c:22:6: error: #error BAUD_RATE error greater than 5%
     #error BAUD_RATE error greater than 5%

The reported error is of the wrong magnitude, and the wrong sign.
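A rough sketch of where that number comes from, emulating the preprocessor's arithmetic in intmax_t and uintmax_t. The function names are hypothetical, and the error formula is assumed to be the percentage difference 100 * (BAUD_ACTUAL - BAUD_RATE) / BAUD_RATE.

```c
#include <stdint.h>

/* Hypothetical helper: the baud error as the preprocessor would compute it
 * when every constant is signed, i.e. entirely in intmax_t. */
intmax_t baud_error_all_signed(intmax_t f_cpu, intmax_t baud_rate)
{
    intmax_t setting = (f_cpu + baud_rate * 4) / (baud_rate * 8) - 1;
    intmax_t actual  = f_cpu / (8 * (setting + 1));
    return (100 * (actual - baud_rate)) / baud_rate;
}

/* Same arithmetic with an unsigned F_CPU. Every operation that involves
 * f_cpu, directly or through an intermediate result, is carried out in
 * uintmax_t, so the negative difference wraps around to a huge value. */
uintmax_t baud_error_unsigned_f_cpu(uintmax_t f_cpu, intmax_t baud_rate)
{
    uintmax_t setting = (f_cpu + baud_rate * 4) / (baud_rate * 8) - 1;
    uintmax_t actual  = f_cpu / (8 * (setting + 1));
    return (100 * (actual - baud_rate)) / baud_rate;
}
```

With f_cpu = 8000000 and baud_rate = 115200, the signed version should come out at -3, which trips the "-2%" warning. The unsigned version yields an enormous positive number, which is why the very first test (BAUD_ERROR >= 5) fires.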

Examination of the object file shows it to be identical to the one built with a signed value for F_CPU.

None of this is a surprise now, with the understanding that the preprocessor treats all constants as either the signed or unsigned variant of the widest integer type.

The surprise is that this isn't explicitly mentioned in either the standard or the GCC docs (that I can find).

Yes, the C rules for evaluating operands are followed exactly by the preprocessor, but only insofar as the case where both operands of a binary operator are of the same rank. I cannot find any text in the standard which states that the preprocessor treats all constants specified with or without L or LL as though they were all LL before the rules for integer promotions are enforced, nor can I find any mention of this behaviour in the GCC docs. The closest is the passage from the GCC docs quoted above stating that the preprocessor "carries out all calculations in the widest integer type known to the compiler".

This does not (should not) explicitly mean that the operands are treated as though they were specified with suffixes designating them as the widest integer type known to the compiler. Indeed, absent an explicit passage on the subject, my expectation would be that the operands would be subject to the same type conversion and integer promotion rules to which all operands are subject when evaluated by the compiler. This doesn't seem to be the case. The implication, based on the tests above, is that the application of the normal C integer promotion rules comes after the preprocessor promotes the operands to the widest (signed or unsigned) integer type known to the compiler.

If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.

EDIT #3: note: I've copied the below paragraphs from the comments section into the post itself, since there were too many comments for it to be seen.

If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.

Here's some text from 6.10.1:

  1. For the purposes of this token conversion and evaluation, all signed integer types and all unsigned integer types act as if they have the same representation as, respectively, the types intmax_t and uintmax_t defined in the header <stdint.h>.

That would seem to clinch it.
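A small sketch that puts the two evaluations side by side in one translation unit. PP_LESS and compiler_less are names invented for this illustration, and the #if line is expected to draw the same "changes sign when promoted" warning as above.

```c
/* Evaluated by the preprocessor: per 6.10.1, -5LL acts as intmax_t and
 * 0uL acts as uintmax_t, so the comparison is an unsigned one. */
#if -5LL < 0uL
#define PP_LESS 1
#else
#define PP_LESS 0
#endif

/* Evaluated by the compiler: the operands keep their own types
 * (long long and unsigned long), so the result depends on their widths. */
int compiler_less(void)
{
    return -5LL < 0uL;
}
```

PP_LESS should be 0 everywhere, whereas compiler_less() should return 1 on targets where long long is wider than long (assuming no padding bits) and 0 where the two have the same width.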


3 Answers


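Quoting the rules for the usual arithmetic conversions from the C11 standard (emphasis mine):

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.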



answered 2015-07-21T16:56:04.713


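For your examples, where you mix signed and unsigned operands of the same rank, this basically results in the signed operand being converted to the unsigned representation, not the other way round. The latter two comparisons are therefore unsigned. This is the same for the preprocessor and the actual compiler. For a two's-complement signed representation (the most common on standard CPUs today), it means the binary representation of the signed integer value is simply reinterpreted as an unsigned (positive) value, so both comparisons fail.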
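A minimal sketch of the same-rank case in ordinary C, for checking this against the compiler. The helper names are invented for illustration; the parameters stand in for the literals so that the conversion happens at the comparison itself.

```c
#include <limits.h>

/* Both operands are int: a plain signed comparison. */
int less_int_int(int a, int b) { return a < b; }

/* unsigned vs. int: b is converted to unsigned before comparing. */
int less_uint_int(unsigned a, int b) { return a < b; }

/* int vs. unsigned: a is converted to unsigned before comparing. */
int less_int_uint(int a, unsigned b) { return a < b; }

/* The value a negative int takes on once converted to unsigned int. */
unsigned as_unsigned(int v) { return (unsigned)v; }

/* Returns 1 if -5 converts to UINT_MAX - 4, i.e. wraps modulo UINT_MAX + 1. */
int minus_five_wraps(void) { return as_unsigned(-5) == UINT_MAX - 4u; }
```

Called as less_int_int(-5, 0), less_uint_int(-(5u), 0) and less_int_uint(-5, 0u), these should give 1, 0 and 0, mirroring the three #if results in the question.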

Note that you should enable -Wconversion (gcc) to see warnings about such problematic conversions.

answered 2015-07-21T16:59:19.820

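In some rare cases the preprocessor's interpretation of numeric constant values may differ from C's. That is a side effect of treating all integer values as the widest available signed or unsigned type, regardless of any width suffix. Given the resulting typed values, however, its rules for evaluating conditional expressions are explicitly the same as C's:

The resulting tokens compose the controlling constant expression which is evaluated according to the rules of [Section] 6.6.

(C99, Section 6.10.1)

Section 6.6 presents C's rules for constant expressions, where (in paragraph 11)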



answered 2015-07-21T17:11:03.780