python - How does matplotlib's histogramdd work?

Question

I find the output of histogramdd confusing. For example:

h, edges = histogramdd([[1,2,1],[4,2,1]],bins=2)

h -> [[ 1.  1.]
     [ 1.  0.]]
edges -> [array([ 1. ,  1.5,  2. ]), array([ 1. ,  2.5,  4. ])]

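Maybe I'm misunderstanding the documentation, but it seems to say that the input should be an array with N rows for the data points and D columns for the dimensions (so in this case we are dealing with two data points in three dimensions). I'm guessing that each array in edges represents a different dimension, but that doesn't seem to make sense given the output h.

How should this be interpreted?

Thanks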

score 9 · Accepted Answer

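Update

I was wrong last time. Here is the correct explanation of histogramdd. First, it is very important to pass an array to histogramdd; otherwise the results are misleading. A plain nested list is read as a sequence of per-dimension coordinate arrays, so each inner list becomes one dimension rather than one point.

Compare this: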

In [59]: h, edges = histogramdd([[1,2,4],[4,2,8],[3,2,1],[2,1,2],[2,1,3],[2,1,1],[2,1,4]],bins=3)
h.shape
Out[59]: (3, 3, 3, 3, 3, 3, 3)

with this:

In [60]: h, edges = histogramdd(array([[1,2,4],[4,2,8],[3,2,1],[2,1,2],[2,1,3],[2,1,1],[2,1,4]]),bins=3)
h.shape
Out[60]: (3, 3, 3)
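
The difference can be reproduced on the small input from the question. This is a minimal sketch, assuming NumPy's documented behavior that a sample which is not an array is read as a sequence of per-dimension coordinate arrays; the names `rows`, `h_list`, `h_arr`, and `h_points` are only for illustration.

```python
import numpy as np

rows = [[1, 2, 1], [4, 2, 1]]

# Nested list: each inner list is treated as one dimension, so this is
# 3 points in 2-D: (1, 4), (2, 2), (1, 1).
h_list, edges_list = np.histogramdd(rows, bins=2)

# Equivalent explicit form: a (3, 2) array with one point per row.
h_arr, edges_arr = np.histogramdd(np.array(rows).T, bins=2)

# Array of shape (2, 3): 2 points in 3-D, which is what the question intended.
h_points, _ = np.histogramdd(np.array(rows), bins=2)

print(h_list.shape, h_points.shape)
print(np.array_equal(h_list, h_arr))
```

Read this way, the 2x2 result in the question is the histogram of three 2-D points, not of two 3-D points.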

With the second approach, we get sensible results:

In [61]: h, edges = histogramdd(array([[1,2,4],[4,2,8],[3,2,1],[2,1,2],[2,1,3],[2,1,1],[2,1,4]]),bins=3)
In [64]: h
Out[64]:
array([[[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  1.,  0.]],

       [[ 3.,  1.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 1.,  0.,  1.]]])
In [65]: edges
Out[65]:
[array([ 1.,  2.,  3.,  4.]),
 array([ 1.        ,  1.33333333,  1.66666667,  2.        ]),
 array([ 1.        ,  3.33333333,  5.66666667,  8.        ])]
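
As a rough check on where these edges come from: with an integer `bins` and no explicit range, each dimension's edges appear to be evenly spaced between that dimension's minimum and maximum. The sketch below compares them against `np.linspace`; the variable names are illustrative, and the comparison is an assumption about the default behavior rather than a quote of NumPy's source.

```python
import numpy as np

points = np.array([[1, 2, 4], [4, 2, 8], [3, 2, 1], [2, 1, 2],
                   [2, 1, 3], [2, 1, 1], [2, 1, 4]])
n_bins = 3
h, edges = np.histogramdd(points, bins=n_bins)

for d, e in enumerate(edges):
    lo, hi = points[:, d].min(), points[:, d].max()
    # n_bins bins need n_bins + 1 edges, evenly spaced from min to max.
    expected = np.linspace(lo, hi, n_bins + 1)
    print(f"dimension {d}: edges={e}, matches linspace: {np.allclose(e, expected)}")
```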

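Our input is [1,2,4], [4,2,8], etc. The edges give the bin boundaries for each dimension. In this example, [1,2,4] is counted as follows: 1 falls in the first bin of array([1.,2.,3.,4.]) because it lies between 1 and 2; 2 falls in the third bin of array([ 1. , 1.33333333, 1.66666667, 2. ]) because it lies between 1.66666667 and 2 (the last bin includes its right edge); and 4 falls in the second bin of array([ 1. , 3.33333333, 5.66666667, 8. ]) because it lies between 3.33333333 and 5.66666667. So the point [1,2,4] has the bin coordinates (first, third, second). This means it is counted in the element at the third row, second column of the first array: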

[[ 0.,  0.,  0.],
 [ 0.,  0.,  0.],
 [ 0.,  1.*,  0.]]

I added a * to make it easier to spot. The second point, [4,2,8], falls in the third bin along each of x, y, and z (third array, third row, third column):

[[ 0.,  0.,  0.],
 [ 0.,  0.,  0.],
 [ 1.,  0.,  1.*]]

As a last example, the third point, [3,2,1], falls in the third, third, and first bins along x, y, and z respectively (third array, third row, first column):

[[ 0.,  0.,  0.],
 [ 0.,  0.,  0.],
 [ 1.*,  0.,  1.]]
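
The walkthrough above can be checked by computing each point's bin coordinates by hand and looking up that cell in h. This is a sketch under the assumption that bins are half-open on the right except for the last one, which includes its right edge; `bin_index` is a hypothetical helper written for this example, not part of NumPy.

```python
import numpy as np

points = np.array([[1, 2, 4], [4, 2, 8], [3, 2, 1], [2, 1, 2],
                   [2, 1, 3], [2, 1, 1], [2, 1, 4]])
h, edges = np.histogramdd(points, bins=3)


def bin_index(point, edges):
    """Return the 0-based bin index of `point` along each dimension."""
    idx = []
    for value, e in zip(point, edges):
        # Index of the last edge that is <= value.
        i = np.searchsorted(e, value, side="right") - 1
        # A value equal to the final edge belongs to the last bin.
        idx.append(int(min(i, len(e) - 2)))
    return tuple(idx)


for p in points[:3]:
    i = bin_index(p, edges)
    print(p, "-> bin", i, "count in that cell:", h[i])
```

The indices are 0-based, so the "first array, third row, second column" cell for [1,2,4] is h[0, 2, 1].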
