2

是否可以以查找表的形式进行浮点除法的倒数(例如 1/f -> 1*inv[f] )?怎么可能做到?我认为 some 和 mask 和 shift 应该应用于 float 以使其成为一种索引形式?究竟会怎样?

4

3 回答 3

7

You can guess an approximate inverse like this:

int x = bit_cast<int>(f);
x = 0x7EEEEEEE - x;
float inv = bit_cast<float>(x);

In my tests, 0x7EF19D07 was slightly better (tested with the effects of 2 Newton-Raphson refinements included).

Which you can then improve with Newton-Raphson:

inv = inv * (2 - inv * f);

Iterate as often as you want. 2 or 3 iterations give nice results.

Better Initial Approximations

To minimize the relative error:

  • 0x7EF311C2 (without refinement)
  • 0x7EF311C3 (1 refinement)
  • 0x7EF312AC (2 refinements)
  • 0x7EEEEBB3 (3 refinements)

To minimize the absolute error for inputs between 1 and 2 (they work well enough outside that range, but they may not be the best):

  • 0x7EF504F3 (without refinement)
  • 0x7EF40D2F (1 refinement)
  • 0x7EF39252 (2 refinements)

For three refinement steps, the initial approximation barely affects the maximum relative error. 0x7EEEEEEE works great, and I couldn't find anything better.

于 2012-09-01T13:38:41.117 回答
4

一种方法是:

  1. 从输入中提取符号、指数和尾数
  2. 使用一些最重要的尾数位在表中查找其倒数
  3. 取反指数,并根据尾数比例的变化进行调整
  4. 重新组合符号、指数和尾数以形成输出

In step 2, you'll need to choose the number of bits to use, trading between accuracy and table size. You could obtain more accuracy by using the less significant bits to interpolate between table entries.

In step 3, the adjustment is necessary because the input mantissa was in the range (0.5, 1.0], and so its reciprocal is in the range [1.0, 2.0), which needs renormalising to give the output mantissa.

I won't try to write the code for this, since there are probably some slightly fiddly edge cases that I'd miss.

You should also investigate methods involving numerical calculations, which might give better results if memory access is slow; on a modern PC architecture, a cache miss might be as expensive as dozens of arithmetic operations. Wikipedia looks like a good starting point. And of course, whatever you do, measure it to make sure it is actually faster than an FPU division operation.

于 2012-09-01T11:39:51.063 回答
1

如果您的最小步长类似于 0.01,那么您可以支持表格中的 inverse-f。每个索引乘以 100,因此您可以拥有

table[1]----->1.0/0.01
table[3]----->1.0/0.03
table[105]--->1.0/1.05
...
table[10000]->1.0/100.0


10000 elements for a range of (0.00,100.00)

如果你想要更好的精度,你将需要更多的内存。

另一个例子:

range................: 0.000 - 1000.000
minimum increments ..: 0.001
total element number.: 1 million

something like this: table[2343]=1.0/2.343

另一个例子:

range................: 0.000000 - 1.000000
minimum increments ..: 0.000001
total element number.: 1 million

something like this: table[999999]=1.0/0.999999
于 2012-09-01T11:07:24.493 回答