python - Combine two 2D datasets into a single 2D histogram matrix with a shared scale

Question

I have two datasets, d1 and d2, filled with 2D data at different scales.

import numpy as np
d1 = np.random.normal(-30,20,(500,2))
d2 = np.random.normal(-40,10,(500,2))

I can also create a separate 100 x 100 2D histogram (= grayscale image) from each dataset.

bins = [100,100]
h1 = np.histogram2d(d1[:,0], d1[:,1], bins)[0]
h2 = np.histogram2d(d2[:,0], d2[:,1], bins)[0]

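With this solution, however, each 2D histogram is centered on its own mean, so when the two histograms are plotted on top of each other they appear to be distributed around the same center, which is not actually the case.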
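The effect can be made visible by looking at the bin edges that histogram2d returns alongside the counts. The following is a minimal sketch, assuming a fixed seed (not part of the original post) so the run is repeatable; the names xedges1, xedges2 and same_grid are illustrative only.

```python
import numpy as np

# Seeded generator: an assumption for reproducibility, not in the original post.
rng = np.random.default_rng(0)
d1 = rng.normal(-30, 20, (500, 2))
d2 = rng.normal(-40, 10, (500, 2))

bins = [100, 100]
# histogram2d returns (counts, x edges, y edges).
h1, xedges1, yedges1 = np.histogram2d(d1[:, 0], d1[:, 1], bins)
h2, xedges2, yedges2 = np.histogram2d(d2[:, 0], d2[:, 1], bins)

# Each call derives its edges from its own data, so the two grids differ.
print("d1 x range:", xedges1[0], xedges1[-1])
print("d2 x range:", xedges2[0], xedges2[-1])
same_grid = np.allclose(xedges1, xedges2)
print("same grid:", same_grid)
```

Because pixel (i, j) covers a different region of the plane in h1 than in h2, overlaying the two arrays hides the offset between the datasets.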

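What I want is a single 100 x 100 x 2 histogram matrix (comparable to a 2-channel image) that takes the different scales of the data into account, so the displacement is not lost.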

score 1 · Accepted Answer

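If you pass histogram2d the value bins=[100, 100], you are asking it to compute 100 bins in each dimension automatically. You can compute the bin edges yourself (100 bins need 101 edges), so these two

bins = [
    np.linspace(x.min(), x.max(), 101),  # 101 edges define 100 bins
    np.linspace(y.min(), y.max(), 101)
]
h1 = np.histogram2d(x, y, bins)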

and

bins = [100, 100]
h1 = np.histogram2d(x, y, bins)

are equivalent.
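The equivalence is easy to check numerically. The sketch below assumes arbitrary sample data for x and y (the answer does not define them) and compares both the returned edges and the counts; note that n bins require n + 1 edges.

```python
import numpy as np

# Sample data for x and y is an assumption; any 1D arrays would do.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)

n = 100
# n bins need n + 1 edges.
edges = [
    np.linspace(x.min(), x.max(), n + 1),
    np.linspace(y.min(), y.max(), n + 1),
]

h_int, xe_int, ye_int = np.histogram2d(x, y, bins=[n, n])
h_exp, xe_exp, ye_exp = np.histogram2d(x, y, bins=edges)

edges_match = np.allclose(xe_int, xe_exp) and np.allclose(ye_int, ye_exp)
counts_match = np.array_equal(h_int, h_exp)
print(edges_match, counts_match)
```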

Knowing this, we can compute the bin edges over both arrays combined and use those for both histograms:

bins = [
    # 101 edges define 100 bins spanning both datasets
    np.linspace(
        min(d1[:, 0].min(), d2[:, 0].min()),
        max(d1[:, 0].max(), d2[:, 0].max()),
        101
    ),
    np.linspace(
        min(d1[:, 1].min(), d2[:, 1].min()),
        max(d1[:, 1].max(), d2[:, 1].max()),
        101
    )
]
# [0] keeps only the counts; histogram2d also returns the edges
h1 = np.histogram2d(d1[:,0], d1[:,1], bins)[0]
h2 = np.histogram2d(d2[:,0], d2[:,1], bins)[0]
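As an aside, the same shared grid can probably be obtained without building the edges by hand, by passing an integer bin count together with the range parameter of histogram2d. This is a sketch of that variant, not something the original answer uses; the seed and the names lo, hi and shared_range are assumptions.

```python
import numpy as np

# Seeded generator: an assumption for reproducibility.
rng = np.random.default_rng(0)
d1 = rng.normal(-30, 20, (500, 2))
d2 = rng.normal(-40, 10, (500, 2))

# Per-axis minimum and maximum over both datasets.
lo = np.minimum(d1.min(axis=0), d2.min(axis=0))
hi = np.maximum(d1.max(axis=0), d2.max(axis=0))
shared_range = [[lo[0], hi[0]], [lo[1], hi[1]]]

# Same bin count and same range give both histograms the same grid.
h1, xe1, ye1 = np.histogram2d(d1[:, 0], d1[:, 1], bins=100, range=shared_range)
h2, xe2, ye2 = np.histogram2d(d2[:, 0], d2[:, 1], bins=100, range=shared_range)
```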

Or stack the two datasets together and simplify the code slightly:

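d = np.stack((d1, d2))  # shape (2, 500, 2)

bins = [
    # 101 edges define 100 bins spanning both datasets
    np.linspace(d[..., 0].min(), d[..., 0].max(), 101),
    np.linspace(d[..., 1].min(), d[..., 1].max(), 101),
]

# [0] keeps only the counts; histogram2d also returns the edges
h1 = np.histogram2d(d[0, :, 0], d[0, :, 1], bins)[0]
h2 = np.histogram2d(d[1, :, 0], d[1, :, 1], bins)[0]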
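To arrive at the single 100 x 100 x 2 matrix the question asks for, the two count arrays can be stacked along a new last axis. The following is a minimal end-to-end sketch under the same assumptions as above (seeded data, 101 edges per axis); the names edges, channels and hist are illustrative.

```python
import numpy as np

# Seeded generator: an assumption for reproducibility.
rng = np.random.default_rng(0)
d1 = rng.normal(-30, 20, (500, 2))
d2 = rng.normal(-40, 10, (500, 2))

d = np.stack((d1, d2))  # shape (2, 500, 2)

n = 100
# Shared edges over both datasets; n bins need n + 1 edges.
edges = [
    np.linspace(d[..., 0].min(), d[..., 0].max(), n + 1),
    np.linspace(d[..., 1].min(), d[..., 1].max(), n + 1),
]

# One count array per dataset, all on the same grid.
channels = [np.histogram2d(s[:, 0], s[:, 1], bins=edges)[0] for s in d]

# Stack along a new last axis: one channel per dataset.
hist = np.stack(channels, axis=-1)
print(hist.shape)
```

Since the edges cover the combined range, no sample falls outside the grid, and each channel sums to the number of points in its dataset.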
