I am trying to run a hierarchical model with pymc3 in Spyder on Win10. I have some global model parameters (theta, omega, sigma) and one cohort-specific parameter (Ci).

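The model takes a pandas DataFrame containing the relevant data as input. The first column is called 'Cohort', the second is called 'Period', and the third contains the observations.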

The number of observations varies between cohorts.

The code looks like this:

import pymc3 as pm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import theano.tensor as tt 

cohorts_idx, cohorts = pd.factorize(inputs['Cohort'], sort = True)
periods_idx, periods = pd.factorize(inputs['Period'], sort = True)

coords = {
    "cohort": cohorts,
    "period": periods,
    "collections": np.arange(len(cohorts_idx))
    }

with pm.Model(coords = coords) as model:
    
    # global model parameters
    omega = pm.HalfNormal("omega", sigma = 3)
    theta = pm.HalfNormal("theta", sigma = 5)
    sigma = pm.HalfNormal("sigma", sigma = 20)
            
    # cohort specific parameter
    Ci = pm.TruncatedNormal("Ci", mu = 60, sigma = 10, lower = 10, upper = 110, dims = "cohort")
    

    mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))
    sigma_i_t = sigma * mu_i_t ** 0.5
    
    _ = pm.Normal("Collections_i_t",
                                mu = mu_i_t,
                                sigma = sigma_i_t,            
                                observed = inputs['Collections'],
                                dims = "collections")
         
    results = pm.sample(draws = 1000, tune = 1000, cores = 8)
 
    return pm.summary(results)

The resulting error message is:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 139, in index_arithmetic_method
result = op(Series(self), other)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\common.py", line 64, in new_method
return method(self, other)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py", line 505, in wrapper
return _construct_result(left, result, index=left.index, name=res_name)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py", line 478, in _construct_result
out = left._constructor(result, index=index)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\series.py", line 279, in __init__
data = com.maybe_iterable_to_list(data)

File "C:\Users\alexi\Anaconda3\lib\site-packages\pandas\core\common.py", line 280, in maybe_iterable_to_list
return list(obj)

File "C:\Users\alexi\Anaconda3\lib\site-packages\theano\tensor\var.py", line 640, in __iter__
for i in xrange(theano.tensor.basic.get_vector_length(self)):

File "C:\Users\alexi\Anaconda3\lib\site-packages\theano\tensor\basic.py", line 4828, in get_vector_length
raise ValueError("length not known: %s" % msg)

ValueError: length not known: Elemwise{true_div,no_inplace} [id A] ''   
|TensorConstant{[ 1  2  3 ..  1  2  1]} [id B]
|InplaceDimShuffle{x} [id C] ''   
|ViewOp [id D] 'theta'   
 |Elemwise{exp,no_inplace} [id E] ''   
   |theta_log__ [id F]

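I have no idea why. Note that if I add theta instead of dividing by it in the line that raises the error, the code works (but obviously that is not what I want).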

How can I fix this and get the division to work?


1 Answer


OK, I figured it out. I don't know enough to explain why, but the following works. This line needs to be replaced:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx] / theta) ** omega))

with:

mu_i_t = Ci[cohorts_idx] * (1 - tt.exp(- (periods[periods_idx].to_numpy() / theta) ** omega))
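
Judging from the traceback in the question, a likely explanation is that `periods` (the second return value of `pd.factorize` on a Series) is a pandas Index, so `periods[periods_idx] / theta` is handled by pandas first. pandas then tries to build a Series from the result and iterates over the Theano tensor, which Theano refuses because the tensor's length is not known. Converting to a plain NumPy array keeps pandas out of the arithmetic. The sketch below only checks the types involved; the small `inputs` DataFrame is a made-up stand-in, and no pymc3 or Theano is needed to run it:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `inputs` DataFrame described in the question.
inputs = pd.DataFrame({
    "Cohort": ["A", "A", "A", "B", "B", "C"],
    "Period": [1, 2, 3, 1, 2, 1],
    "Collections": [10.0, 18.0, 23.0, 12.0, 20.0, 9.0],
})

cohorts_idx, cohorts = pd.factorize(inputs["Cohort"], sort=True)
periods_idx, periods = pd.factorize(inputs["Period"], sort=True)

# Indexing a pandas Index with an integer array gives back a pandas Index,
# so arithmetic on it is dispatched to pandas rather than to Theano.
period_per_row = periods[periods_idx]
print(type(period_per_row))

# .to_numpy() yields a plain ndarray with one period value per observation.
period_values = period_per_row.to_numpy()
print(type(period_values), period_values)
```

Inside the model, `period_values` can then be used in place of `periods[periods_idx]` in the expression for `mu_i_t`.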

Answered 2020-10-31T15:01:09.877