python - Spearman rank correlation with ties in Python

Question

I want to compute the Spearman rank correlation in Python, most likely with the scipy implementation (scipy.stats.spearmanr).

The data at hand looks like this (dictionaries):

{a:0.3, b:0.2, c:0.2} and {a:0.5, b:0.6, c:0.4}

To pass this to the spearman module, if I am correct, I would assign ranks (in descending order):

[1,2,3] and [2,1,3]

Now I want to account for ties, so for the first vector I would use:

[1,2,2] or [1,2.5,2.5]

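Basically, is this whole approach correct, and how should ties be handled for dictionary-based data like this?

As @Jaime suggested, the spearmanr function works on the values directly, but why does it behave like this: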

In [5]: spearmanr([0,1,2,3],[1,3,2,0])
Out[5]: (-0.39999999999999997, 0.59999999999999998)

In [6]: spearmanr([10,7,6,5],[0.9,0.5,0.6,1.0])
Out[6]: (-0.39999999999999997, 0.59999999999999998)

Thanks!

score 12 · Accepted Answer

scipy.stats.spearmanr will compute the ranks for you; you only need to give it the data in the correct order:

>>> import scipy.stats
>>> scipy.stats.spearmanr([0.3, 0.2, 0.2], [0.5, 0.6, 0.4])
(0.0, 1.0)
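
Since the data in the question is stored in dictionaries, "the correct order" means that position i in both lists must refer to the same key. A minimal sketch of one way to do that, assuming both dictionaries use string keys (the variable names `scores_x`, `scores_y`, and `keys` are illustrative, not from the question):

```python
from scipy.stats import spearmanr

# Hypothetical names for the two dictionaries from the question.
scores_x = {"a": 0.3, "b": 0.2, "c": 0.2}
scores_y = {"a": 0.5, "b": 0.6, "c": 0.4}

# Use one shared key order so that position i refers to the same item in both lists.
# Keys missing from either dictionary are dropped by the intersection.
keys = sorted(scores_x.keys() & scores_y.keys())
x = [scores_x[k] for k in keys]
y = [scores_y[k] for k in keys]

# spearmanr ranks the raw values itself, including the tie between "b" and "c".
rho, p_value = spearmanr(x, y)
print(keys, x, y)
print(rho, p_value)
```

Sorting the shared keys is just one convenient choice; any fixed order works as long as both lists are built from the same one.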

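If you already have ranked data, you can call scipy.stats.pearsonr on it to get the same result. As the examples below show, either of the approaches you tried will work, although I think [1, 2.5, 2.5] is more common. The starting index does not matter either: zero-based ranks such as [0, 1.5, 1.5] give the same correlation, because the coefficient is unchanged when every rank is shifted by a constant: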

>>> scipy.stats.pearsonr([1, 2, 2], [2, 1, 3])
(0.0, 1.0)
>>> scipy.stats.pearsonr([1, 2.5, 2.5], [2, 1, 3])
(0.0, 1.0)
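
If you would rather not write the ranks by hand, they can be produced with scipy.stats.rankdata. The following is a sketch under the assumption that average ranks for ties (the [1, 2.5, 2.5] convention) are what you want. Negating the values is only a trick to get descending ranks, as in the question, because rankdata ranks in ascending order:

```python
from scipy.stats import pearsonr, rankdata, spearmanr

x = [0.3, 0.2, 0.2]
y = [0.5, 0.6, 0.4]

# rankdata ranks in ascending order; negate the values to rank in descending order.
# method="average" gives tied values the mean of the ranks they span.
ranks_x = rankdata([-v for v in x], method="average")  # ranks 1, 2.5, 2.5
ranks_y = rankdata([-v for v in y], method="average")  # ranks 2, 1, 3

# Pearson correlation of the ranks should match Spearman correlation of the raw values.
rho_from_ranks = pearsonr(ranks_x, ranks_y)[0]
rho_direct = spearmanr(x, y)[0]
print(ranks_x, ranks_y)
print(rho_from_ranks, rho_direct)
```

Ranking both vectors in the same direction (both ascending or both descending) leaves the coefficient unchanged; reversing only one of them flips its sign.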
