2

Why does the order of multiplication affect the result? Consider the following code:

a=47.215419672114173
b=-0.45000000000000007
c=-0.91006620964286644
result1=a*b*c
temp=b*c
result2=a*temp
result1==result2

We all know that result1 should be equal to result2, however we get:

result1==result2 #FALSE!

The difference is very small:

result1-result2 #3.552713678800501e-15

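However, for certain applications this error may be amplified, so that the outputs of two programs performing the same computation (one using result1, the other using result2) can be completely different.

Why does this happen, and what can be done to address such issues in heavily numerical/scientific applications?

Thanks!

UPDATE

Good answers, but I am still missing the reason why the order of multiplication matters, e.g.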

temp2=a*b
result3=temp2*c
result1==result3 #True

So it seems that the compiler/interpreter treats a*b*c as (a*b)*c

4

8 Answers

9

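All programming languages lose precision when converting floating-point numbers from decimal representation to binary representation, and each arithmetic operation on the binary values rounds its result again. This leads to inaccurate calculations (at least from a base-10 perspective, since the math is actually done on floating-point values represented in binary), including cases where the order of operations changes the result. Most languages provide a data structure that maintains base-10 precision, at the cost of performance. Look at Decimal in Python.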
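A minimal sketch of the Decimal suggestion, assuming the question's values are entered as strings and that the working precision is raised to 60 digits (an arbitrary choice for this example; the default context keeps 28 significant digits and would still round a product of three 17-digit numbers):

```python
from decimal import Decimal, getcontext

# Assumption: 60 significant digits is enough to hold the exact product of
# three 17-digit inputs (at most 51 digits), so no rounding takes place.
getcontext().prec = 60

a = Decimal("47.215419672114173")
b = Decimal("-0.45000000000000007")
c = Decimal("-0.91006620964286644")

left_first = (a * b) * c    # grouping used for result1
right_first = a * (b * c)   # grouping used for result2

print(left_first == right_first)  # both groupings give the same exact value
```

With enough digits nothing is rounded, so the grouping no longer matters; the price is slower arithmetic.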

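Edit:

In response to your update: not exactly. Computers do things in order, so when you give them a sequence of operations they work through them one at a time, in that order. For a*b*c this means a*b is computed (and rounded) first and the result is then multiplied by c. There is no order of operations at work beyond this sequential processing.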
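The grouping can be checked directly with the values from the question. This is only an illustrative sketch; the variable names chained, left_first and right_first are made up for the example:

```python
a = 47.215419672114173
b = -0.45000000000000007
c = -0.91006620964286644

chained = a * b * c         # no parentheses, as in result1
left_first = (a * b) * c    # explicit left-to-right grouping
right_first = a * (b * c)   # the grouping behind result2 in the question

print(chained == left_first)   # True: the chain is evaluated left to right
print(chained - right_first)   # the question reports 3.552713678800501e-15 here
```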

answered 2012-07-24T20:23:00.560
6

When you use floating-point numbers in any programming language, you will lose precision. You can either:

Accommodate the loss of precision, and adjust your equality checks accordingly, as follows:

 are_equal = abs(result1-result2)<0.0001

Where the 0.0001 (epsilon) is a value you set.

Or use the Decimal class provided with Python, which is a bit slower.
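As a rough sketch of the first option applied to the question's numbers, here is an absolute-tolerance check next to the standard library's math.isclose, which uses a relative tolerance by default. The 0.0001 threshold is the value from this answer, not a recommendation:

```python
import math

a = 47.215419672114173
b = -0.45000000000000007
c = -0.91006620964286644

result1 = a * b * c
result2 = a * (b * c)

# Absolute tolerance: fine when the expected magnitude of the values is known.
are_equal_abs = abs(result1 - result2) < 0.0001

# Relative tolerance: math.isclose defaults to rel_tol=1e-09, abs_tol=0.0.
are_equal_rel = math.isclose(result1, result2)

print(are_equal_abs, are_equal_rel)  # True True
```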

answered 2012-07-24T20:26:09.110
6

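Each multiplication produces a result with up to twice as many digits (or bits) as the original numbers, and that result has to be rounded so that it fits back into the space allotted for a floating-point number. When you rearrange the order, this rounding can change the result.

answered 2012-07-24T20:46:07.460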
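A small worked example of that rounding, using fractions.Fraction to hold the exact product. The value x is chosen for the illustration and does not come from the question:

```python
from fractions import Fraction

x = 1.0 + 2.0**-30   # exactly representable: needs only 31 significant bits

exact = Fraction(x) * Fraction(x)   # exact product: 1 + 2**-29 + 2**-60
rounded = Fraction(x * x)           # what the float multiplication stores

print(exact.numerator.bit_length())           # 61: too wide for a 53-bit significand
print(rounded == 1 + Fraction(1, 2**29))      # True: the 2**-60 term was rounded away
print(exact - rounded == Fraction(1, 2**60))  # True: the rounding error itself
```

Every multiplication in a chain discards a small remainder like this, and which remainder is discarded depends on which pair is multiplied first.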
3

float comparison should always be done (by you) with a small epsilon like 10^-10

answered 2012-07-24T20:24:39.053
1

We all know that result1 should be equal to result2, however we get:

No, we don't all know that. In fact, they should not be equal, which is why they aren't equal.

You seem to believe that you are working with real numbers. You aren't - you are working with IEEE floating point representations. They don't follow the same axioms. They aren't the same thing.

The order of operations matters because Python evaluates each subexpression in turn, and each one results in a (rounded) floating-point number.

answered 2012-07-24T20:24:48.673
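One axiom that real numbers satisfy and IEEE doubles do not is associativity. A minimal illustration with addition (the values are a standard textbook example, not taken from the question):

```python
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)  # False
print(left, right)    # 0.6000000000000001 0.6
```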
1

Representing numbers in computers is a big research area in computer science. The problem is not specific to Python: every programming language has this property, because performing every calculation with arbitrary accuracy by default would be too expensive.

The numerical stability of an algorithm reflects some of the limitations to keep in mind when designing numerical algorithms. As said before, Decimal is defined as a standard for performing precise calculations in banking applications or any other application that might need it. In Python, there is an implementation of this standard.

answered 2012-07-24T21:16:30.977
1

Cause: your machine/Python probably cannot handle this much precision. See: http://en.wikipedia.org/wiki/Machine_epsilon#Approximation_using_Python

What to do: this should help: http://packages.python.org/bigfloat/

answered 2012-07-24T20:31:16.960
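For the machine-epsilon point, the sketch below uses only the standard library (not bigfloat) to look at the spacing of doubles. It assumes Python 3.9 or later for math.ulp, and uses 19.3 as a stand-in for the question's result, which is roughly that size:

```python
import math
import sys

eps = sys.float_info.epsilon   # gap between 1.0 and the next larger double
print(eps == 2.0**-52)         # True

# Spacing between adjacent doubles near the question's result (about 19.3).
# math.ulp requires Python 3.9 or later.
spacing = math.ulp(19.3)
print(spacing == 2.0**-48)               # True
print(3.552713678800501e-15 == spacing)  # True: the reported difference is one such step
```

In other words, result1 and result2 in the question are neighbouring doubles: they differ in the last bit only.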
0

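As already well answered in previous posts, this is a common floating-point arithmetic issue in programming languages. You should never apply exact equality to float types.

When you make such comparisons, you can use a function that compares against a given tolerance (threshold). If the numbers are close enough, they should be considered equal in value. Something like: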

def isequal_float(x1,x2, tol=10**(-8)):
    """Returns the results of floating point equality, according to a tolerance."""
    return abs(x1 - x2)<tol

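would do the trick. If I am not mistaken, the appropriate tolerance depends on whether the float type is single or double precision, which depends on the language you are using.

Such a function lets you easily compare the results of calculations, for example in numpy. Take the following example, where the correlation matrix of a dataset with continuous variables is calculated in two ways: the pandas method pd.DataFrame.corr() and the numpy function np.corrcoef().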
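The right tolerance also depends on the magnitude of the values being compared. The sketch below contrasts the absolute check above with a hypothetical relative variant; isequal_float_rel and its rel_tol default are made up for this illustration:

```python
def isequal_float(x1, x2, tol=10**(-8)):
    """Absolute-tolerance comparison, as defined above."""
    return abs(x1 - x2) < tol

def isequal_float_rel(x1, x2, rel_tol=1e-9):
    """Hypothetical relative variant: the allowed gap scales with the inputs."""
    return abs(x1 - x2) <= rel_tol * max(abs(x1), abs(x2))

# Large values: relatively very close, but the absolute gap is about 1e8.
big, big_near = 1e20, 1e20 * (1 + 1e-12)
print(isequal_float(big, big_near))      # False
print(isequal_float_rel(big, big_near))  # True

# Tiny values: one is twice the other, but the absolute gap is only 1e-12.
tiny, tiny_double = 1e-12, 2e-12
print(isequal_float(tiny, tiny_double))      # True
print(isequal_float_rel(tiny, tiny_double))  # False
```

A fixed absolute threshold works when the scale of the data is known, as with correlation coefficients in [-1, 1]; otherwise a relative check is usually the safer default.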

import numpy as np
import seaborn as sns 

iris = sns.load_dataset('iris')
iris.drop('species', axis = 1, inplace=True)

# calculate correlation coefficient matrices using two different methods
cor1 = iris.corr().to_numpy()
cor2 = np.corrcoef(iris.transpose())

print(cor1)
print(cor2)

The results look identical:

[[ 1.         -0.11756978  0.87175378  0.81794113]
 [-0.11756978  1.         -0.4284401  -0.36612593]
 [ 0.87175378 -0.4284401   1.          0.96286543]
 [ 0.81794113 -0.36612593  0.96286543  1.        ]]
[[ 1.         -0.11756978  0.87175378  0.81794113]
 [-0.11756978  1.         -0.4284401  -0.36612593]
 [ 0.87175378 -0.4284401   1.          0.96286543]
 [ 0.81794113 -0.36612593  0.96286543  1.        ]]

But a test for exact equality says otherwise. These operators:

print(cor1 == cor2)
print(np.equal(cor1, cor2))

will yield mostly False results element-wise:

[[ True False False False]
 [False False False False]
 [False False False False]
 [False False False  True]]

Likewise, np.array_equal(cor1, cor2) will also yield False. However, the custom function gives the comparison you want:

out = [isequal_float(i,j) for i,j in zip(cor1.reshape(16, ), cor2.reshape(16, ))]
print(out)

[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True]

Note: numpy includes the function np.allclose() for performing floating-point element-wise comparisons on numpy arrays.

print(np.allclose(cor1, cor2))
>>>True
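As a sketch of what np.allclose checks, the example below uses small made-up arrays (not the iris data) and writes out NumPy's documented rule, abs(a - b) <= atol + rtol * abs(b), with the default tolerances rtol=1e-05 and atol=1e-08:

```python
import numpy as np

x = np.array([1.0, 0.1 + 0.2])
y = np.array([1.0, 0.3])

print(np.array_equal(x, y))   # False: 0.1 + 0.2 is not exactly 0.3
print(np.isclose(x, y))       # element-wise tolerance check
print(np.allclose(x, y))      # True: all elements are within tolerance

# The same rule written out by hand with NumPy's default tolerances.
rtol, atol = 1e-05, 1e-08
manual = np.abs(x - y) <= atol + rtol * np.abs(y)
print(manual.all())           # True
```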
answered 2019-07-16T15:51:05.393