c - How does int-to-float conversion work for large numbers?

Question

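If we convert an integer to a float, and it is too large to be represented exactly by a float, it has to be rounded or truncated. Here is a small test program to look at this rounding.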

#include <stdio.h>

/* No trailing semicolon in the macro body: the call site supplies it. */
#define INT2FLOAT(num) printf(" %d: %.0f\n", (num), (float)(num))

int main(void)
{
    INT2FLOAT((1<<24) + 1);
    INT2FLOAT((1<<24) + 2);
    INT2FLOAT((1<<24) + 3);
    INT2FLOAT((1<<24) + 4);
    INT2FLOAT((1<<24) + 5);
    INT2FLOAT((1<<24) + 6);
    INT2FLOAT((1<<24) + 7);
    INT2FLOAT((1<<24) + 8);
    INT2FLOAT((1<<24) + 9);
    INT2FLOAT((1<<24) + 10);

    return 0;
}

The output is:

 16777217: 16777216
 16777218: 16777218
 16777219: 16777220
 16777220: 16777220
 16777221: 16777220
 16777222: 16777222
 16777223: 16777224
 16777224: 16777224
 16777225: 16777224
 16777226: 16777226

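A value halfway between two representable integers is sometimes rounded up and sometimes rounded down. Some kind of rounding rule seems to be applied. How exactly does this work, and where can I find the code that performs this conversion?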

score 5 · Accepted Answer

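The behavior of this conversion is implementation-defined (C11 6.3.1.4/2):

"If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner."

This means your compiler should document how it works, but you may not be able to control it.

There are various functions and macros for controlling the rounding direction when rounding a floating-point source to an integer, but I don't know of any for the case of converting an integer to floating point.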

score 2 · Accepted Answer

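In addition to what the other answers say: the x87 floating-point unit on Intel processors, for example, works internally with a full 80-bit floating-point representation, which has bits to spare. So when it rounds the number to the nearest float (which has 23 stored significand bits, as I assume from your output), it can be very exact and take every bit of the int into account.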

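IEEE 754 specifies a 32-bit float as having 23 bits dedicated to storing the significand. For normalized numbers the most significant bit is implicit (it is not stored, because it is always a 1 bit), so you actually have a 24-bit significand of the form 1xxxxxxx_xxxxxxxx_xxxxxxxx. That makes 2^24-1 the largest number that uses all 24 bits at unit spacing (11111111_11111111_11111111, in fact). 2^24 itself is still exact, but beyond it you can represent all the even numbers and none of the odd ones, because you lack the least significant bit needed to represent them. This should mean you are able to represent: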

                                                     v binary point.
16777210  == 2^24-6        11111111_11111111_11111010.
16777211  == 2^24-5        11111111_11111111_11111011.
16777212  == 2^24-4        11111111_11111111_11111100.
16777213  == 2^24-3        11111111_11111111_11111101.
16777214  == 2^24-2        11111111_11111111_11111110.
16777215  == 2^24-1        11111111_11111111_11111111.
16777216  == 2^24         10000000_00000000_00000000_. <-- from here the step becomes 2, as there are only 24 significand bits to play with.
16777217  == 2^24+1       10000000_00000000_00000000_. (there should be a 1 bit after the last 0, but there is no room for it)
16777218  == 2^24+2       10000000_00000000_00000001_.
...
33554430  == 2^25-2       11111111_11111111_11111111_.
33554432  == 2^25        10000000_00000000_00000000__. <-- from here the step becomes 4, as there is another shift
33554436  == 2^25+4      10000000_00000000_00000001__.
...

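If you picture the problem in base 10, suppose our floating-point numbers have only 3 decimal digits of significand, plus an exponent giving the power of 10. Counting up from 0, we get: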

  1  => 1.00E0
...
  8  => 8.00E0
  9  => 9.00E0
 10  => 1.00E1  <<< see what happened here... this is the same number as the first but with the ten's exponent incremented, meaning a one digit shift of every digit to the left.
 11  => 1.10E1
...
 98  => 9.80E1
 99  => 9.90E1
100  => 1.00E2  <<< and here.
101  => 1.01E2
...
996  => 9.96E2
997  => 9.97E2
998  => 9.98E2
999  => 9.99E2
1000 => 1.00E3  <<< exact, but from here on there is no fourth digit left to represent the units.
1001 => 1.00E3  (this number cannot be represented exactly)
...
1004 => 1.00E3  (this number cannot be represented exactly)
1005 => 1.01E3  (this number cannot be represented exactly) <<< exactly halfway between 1.00E3 and 1.01E3: rounding is applied (up, in this illustration), but the implementation is free to pick either neighbor.
...
1009 => 1.01E3  (this number cannot be represented exactly)
1010 => 1.01E3 <<< this is the next number that can be represented exactly with three significant digits. So we switched from steps of one to steps of ten.
...

Note

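The case you show is the default IEEE 754 rounding mode, which is also one of the rounding modes specified for Intel processors: round to nearest, ties to even. A value is rounded to the closer representable number; when it lies exactly halfway between two of them, the one whose significand has a 0 in its least significant bit (the even one) is chosen. This avoids the systematic upward bias of always rounding halfway cases up, which matters a great deal in banking (banks avoid floating point because it does not give them precise control over rounding).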
