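I am fitting some data to recover the parameters of a sum (mixture) of log-normal distributions, using pomegranate's mixture model. First, I generate 3 populations with known mu and sigma. Then I run my program to see whether pomegranate can recover those parameters. However, the result is not very accurate: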
import numpy as np
from pomegranate import GeneralMixtureModel, LogNormalDistribution
# Generate 3 populations with known mu and sigma
s1 = np.random.lognormal(2, 0.6, size=(1000, 1))
s2 = np.random.lognormal(1.8, 0.3, size=(1000, 1))
s3 = np.random.lognormal(1.6, 0.7, size=(1000, 1))
# Universe: all populations mixed together
S = np.concatenate((s1, s2, s3), axis=0)
# Fit a 3-component mixture of log-normal distributions
model = GeneralMixtureModel.from_samples([LogNormalDistribution], 3, S)
print(model.distributions[0].parameters[0])
print(model.distributions[1].parameters[0])
print(model.distributions[2].parameters[0])
Expected mu and sigma for each population:
s1 = (2, 0.6)
s2 = (1.8, 0.3)
s3 = (1.6, 0.7)
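As a sanity check on these expected values: the log of a log-normal sample is normally distributed, so the mean and standard deviation of the logs of each population should land close to the mu and sigma above. A minimal sketch using only NumPy; the `log_moments` helper and the fixed seed are my own additions for illustration and are not part of pomegranate or of the program above:

```python
import numpy as np

# Hypothetical helper (not part of pomegranate): estimate the log-normal
# parameters of a single population from the log of its samples.
def log_moments(samples):
    logs = np.log(samples)
    return logs.mean(), logs.std(ddof=1)

# Fixed seed is an assumption, used only to make the check repeatable.
rng = np.random.default_rng(0)

true_params = [(2.0, 0.6), (1.8, 0.3), (1.6, 0.7)]
estimates = []
for mu, sigma in true_params:
    samples = rng.lognormal(mu, sigma, size=(1000, 1))
    estimates.append(log_moments(samples))

for (mu, sigma), (mu_hat, sigma_hat) in zip(true_params, estimates):
    print(f"true=({mu}, {sigma})  estimated=({mu_hat:.3f}, {sigma_hat:.3f})")
```

This only confirms that each separately generated population matches its nominal parameters. It says nothing about how well a mixture fit can separate the populations once they are concatenated.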
Actual output of the pomegranate model:
[{
"class" : "Distribution",
"name" : "LogNormalDistribution",
"parameters" : [
1.845882204858477,
0.3306239407136521
],
"frozen" : false
}]
[{
"class" : "Distribution",
"name" : "LogNormalDistribution",
"parameters" : [
2.8607217186274694,
0.4139617889468684
],
"frozen" : false
}]
[{
"class" : "Distribution",
"name" : "LogNormalDistribution",
"parameters" : [
1.6632006730938589,
0.679604917128916
],
"frozen" : false
}]
My question is: how can I make this result more accurate?