
I am using the Python library SciPy to calculate Pearson's correlation coefficient for two float arrays. The returned coefficient is always 1.0, even when the arrays are different. For example:

[-0.65499887  2.34644428]
[-1.46049758  3.86537321]

I am calling the routine in this way:

r_row, p_value = scipy.stats.pearsonr(array1, array2)

The value of r_row is always 1.0. What am I doing wrong?


2 Answers


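Pearson's correlation coefficient measures how well your data fit a straight line. If you supply only two points, there is always a line that passes exactly through both of them, so your data fit a line perfectly and the coefficient is exactly 1 (or -1 if that line slopes downward).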
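To make the two-point case concrete, here is a minimal sketch that computes the coefficient straight from its definition, using only the standard library. The helper name `pearson_r` is made up for illustration and is not part of SciPy; the data are the two arrays from the question.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from its definition (illustrative helper, not SciPy's code)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations from the mean of xs
    dy = [y - mean_y for y in ys]  # deviations from the mean of ys
    cross = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cross / spread

array1 = [-0.65499887, 2.34644428]
array2 = [-1.46049758, 3.86537321]

r_up = pearson_r(array1, array2)          # both arrays increase together
r_down = pearson_r(array1, array2[::-1])  # second array reversed: downward slope
print(r_up, r_down)
```

With two points the deviations are just plus and minus half the gap between the values, so the ratio collapses to +1 or -1 (up to floating-point rounding) whatever the numbers are. If the two values in either array are equal, the denominator is zero and the coefficient is undefined.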

answered 2013-04-17T15:47:44.350

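I think the Pearson correlation coefficient always comes out as 1.0 or -1.0 if each array has only two elements, since you can always draw a perfect straight line through two points. Try arrays of length 3 and it will work: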

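import numpy as np
from scipy.stats import pearsonr

# Three points per array, so a perfect linear fit is no longer guaranteed.
# scipy.array was only an alias for numpy.array and is deprecated; use NumPy directly.
x = np.array([-0.65499887,  2.34644428, 3.0])
y = np.array([-1.46049758,  3.86537321, 21.0])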

r_row, p_value = pearsonr(x, y)

Result:

>>> r_row
0.79617014831975552
>>> p_value
0.41371200873701036
answered 2013-04-17T15:24:15.700