我正在尝试使用 pymc 来解决一个简单的模型:
我有 N=1000 个通量,我知道这些通量来自帕累托分布:通量 ~ Pareto(alpha, 1)
我正在尝试计算 Pareto 的 alpha 参数: alpha ~ Uniform(1, 3)
我的通量测量值被高斯噪声污染:flux_meas ~ Gaussian(flux, tau)
目前,我只是试图模拟如果我改变噪音量会发生什么。
问题是,当我以非常小的(即可忽略的)噪声量运行模型时,每次运行时 alpha 的平均值都会发生根本变化,并且似乎与 alpha 的真实值根本无关。但是,如果我完全忽略高斯噪声,并说我只是直接观察帕累托分布,它会按预期工作。
我究竟做错了什么?
一个工作代码片段在这里:(一个更复杂,更旧的代码片段在这里)
关键位如下:
import pymc
N = 1000
true_alpha = 2
noise = 0.001 # This noise is much smaller than the signal
# Simulated fluxes
s_arr = pymc.rpareto(true_alpha, 1, size=N)
# the unknown alpha
alpha = pymc.Uniform('alpha', 1, 3)
# fluxes are drawn from a Pareto distribution
flux = pymc.Pareto('flux', alpha, 1, size=N)
# My observed fluxes are contaminated by Gaussian noise
flux_meas = pymc.Normal('flux_meas', mu=flux, tau=noise**-2, observed=True,
value=pymc.rnormal(s_arr, tau=noise**-2, size=N))
model = pymc.MCMC([alpha, flux, flux_meas])
# If I run this model several times, the mean of alpha will be somewhere between
# 1 and 3. The variance of alpha is pretty small
model.sample(5000, 1000, 5)