python - Sklearn GMM gives offset Gaussian peaks

Question

I am fitting a mixture of two Gaussians to 1-D data (more than 1000 points).

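The peak of the sum of the two Gaussians appears to be shifted to the left relative to the peak of the histogram. I think this is caused by the cutoff in my data at around 0.5.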

The green and red lines are the two best-fit Gaussians, and the black line is their sum. Here is the plot:

Is there any way to make sure the peaks match, even though data points are missing on the right side?

I am using:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import mixture
    import scipy.stats as stats

    # `data` is the 1-D sample array; sklearn expects shape (n_samples, n_features)
    X = np.asarray(data).reshape(-1, 1)

    g = mixture.GaussianMixture(n_components=2, covariance_type='full')
    g.fit(X)
    weights = g.weights_
    means = g.means_.ravel()
    sigmas = np.sqrt(g.covariances_.ravel())  # 1-D case: variances -> standard deviations

    num_bins = 50
    # density=True replaces the `normed` argument, which newer Matplotlib no longer accepts
    n, bins, patches = plt.hist(X.ravel(), num_bins, density=True, facecolor='blue', alpha=0.2)

    # Grid on which to evaluate the fitted components
    x = np.linspace(X.min(), X.max(), 500)
    comp0 = weights[0] * stats.norm.pdf(x, means[0], sigmas[0])
    comp1 = weights[1] * stats.norm.pdf(x, means[1], sigmas[1])
    plt.plot(x, comp0, c='red')
    plt.plot(x, comp1, c='green')
    plt.plot(x, comp0 + comp1, c='black')

score 1 · Accepted Answer

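The black curve is simply the green Gaussian added to the red one. Because the two Gaussians overlap heavily, the green one still contributes where the red one is near its peak, which shifts the maximum of the sum. If you want the peaks to match, you would have to avoid adding the green Gaussian to the red one in that region.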
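As a rough illustration of this point, the sketch below sums two heavily overlapping weighted Gaussians and compares the location of the maximum of the sum with the peak of the narrower component. The weights, means and standard deviations are hypothetical values picked for the example; they are not the fit from the question.

```python
import numpy as np

def gaussian_pdf(x, mean, sigma):
    """Normal density evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical mixture parameters, chosen only for illustration
weights = np.array([0.6, 0.4])    # [red, green]
means = np.array([0.40, 0.25])
sigmas = np.array([0.05, 0.10])

x = np.linspace(0.0, 0.7, 7001)

# Row 0 is the weighted red component, row 1 the weighted green component
components = weights[:, None] * gaussian_pdf(x[None, :], means[:, None], sigmas[:, None])
total = components.sum(axis=0)    # the "black" curve

peak_of_red = x[np.argmax(components[0])]
peak_of_sum = x[np.argmax(total)]

print("peak of red component:", peak_of_red)
print("peak of the sum:      ", peak_of_sum)
```

With these assumed parameters the maximum of the sum sits slightly to the left of the red component's mean, because the green component is still decreasing there and tilts the total toward its own mean.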
