推理某些表达式的数值精度的正确方法是:
- 测量相对于ULP中正确值的结果差异(最后一个单位),由 WH Kahan 于 1960 年引入。您可以在此处找到 C、Python 和 Mathematica 实现,并在此处了解有关该主题的更多信息。
- 根据它们产生的最坏情况来区分两个或多个表达式,而不是像其他答案或其他任意度量那样的平均绝对误差。这就是如何构造数值逼近多项式(Remez 算法),如何分析标准库方法的实现(例如 Intel atan2)等...
考虑到这一点,version_1: sqrt(1 - x * x) 和 version_2: sqrt((1 - x) * (1 + x)) 产生明显不同的结果。如下图所示,对于 x 接近 1 且错误 > 1_000_000 ulps,version_1 表现出灾难性的性能,而另一方面,version_2 的错误表现良好。
这就是为什么我总是推荐使用version_2,即利用平方差公式。
产生 square_diff_error.csv 文件的 Python 3.6 代码:
from fractions import Fraction
from math import exp, fabs, sqrt
from random import random
from struct import pack, unpack
def ulp(x):
"""
Computing ULP of input double precision number x exploiting
lexicographic ordering property of positive IEEE-754 numbers.
The implementation correctly handles the special cases:
- ulp(NaN) = NaN
- ulp(-Inf) = Inf
- ulp(Inf) = Inf
Author: Hrvoje Abraham
Date: 11.12.2015
Revisions: 15.08.2017
26.11.2017
MIT License https://opensource.org/licenses/MIT
:param x: (float) float ULP will be calculated for
:returns: (float) the input float number ULP value
"""
# setting sign bit to 0, e.g. -0.0 becomes 0.0
t = abs(x)
# converting IEEE-754 64-bit format bit content to unsigned integer
ll = unpack('Q', pack('d', t))[0]
# computing first smaller integer, bigger in a case of ll=0 (t=0.0)
near_ll = abs(ll - 1)
# converting back to float, its value will be float nearest to t
near_t = unpack('d', pack('Q', near_ll))[0]
# abs takes care of case t=0.0
return abs(t - near_t)
with open('e:/square_diff_error.csv', 'w') as f:
for _ in range(100_000):
# nonlinear distribution of x in [0, 1] to produce more cases close to 1
k = 10
x = (exp(k) - exp(k * random())) / (exp(k) - 1)
fx = Fraction(x)
correct = sqrt(float(Fraction(1) - fx * fx))
version1 = sqrt(1.0 - x * x)
version2 = sqrt((1.0 - x) * (1.0 + x))
err1 = fabs(version1 - correct) / ulp(correct)
err2 = fabs(version2 - correct) / ulp(correct)
f.write(f'{x},{err1},{err2}\n')
产生最终图的 Mathematica 代码:
data = Import["e:/square_diff_error.csv"];
err1 = {1 - #[[1]], #[[2]]} & /@ data;
err2 = {1 - #[[1]], #[[3]]} & /@ data;
ListLogLogPlot[{err1, err2}, PlotRange -> All, Axes -> False, Frame -> True,
FrameLabel -> {"1-x", "error [ULPs]"}, LabelStyle -> {FontSize -> 20}]