python - Is there a clever way to parallelize a complex function over an ndarray?

Question

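Python offers many ways to improve code performance (e.g. broadcasting, or packages like numba). As far as I know, though, these approaches rely on the code being built from basic building blocks, e.g. numpy.ndarray operations or functions from numpy.linalg.

In my particular case, I use the statsmodels ThetaModel to forecast (a lot of!) time series, which are grouped in an ndarray.

Is there a clever way to improve the performance of this code or to parallelize it?

At the moment I am using a list comprehension.

(Simplified) working example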

import numpy as np
from statsmodels.tsa.forecasting.theta import ThetaModel

def thetaForecast(series):
    model = ThetaModel(series, period=50, deseasonalize=True, use_test=False).fit()
    forecast = model.forecast(steps=len(series))
    return forecast
    
data = np.random.randn(500,10) # 10 time series each of length 500 (dimensions reduced here for simplification)
dataForecast = np.array([thetaForecast(col) for col in data.transpose()])

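In case it matters: compared with this slightly simplified version, my actual thetaForecast function takes several arguments.

PS: I am not an experienced Stack Overflow user. Tips on how to improve my question are welcome :)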

score 0 · Accepted Answer

Have you tried multiprocessing? Unless ThetaModel.forecast() releases the GIL (which it can do if it is implemented in C or Fortran), multiprocessing is the main way to parallelize it.
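As a rough illustration of the multiprocessing route, here is a minimal sketch that sends one column per task to a pool of worker processes. The names forecast_columns and naive_forecast are hypothetical and not part of statsmodels; naive_forecast is only a stand-in that repeats the series mean so the sketch runs without statsmodels installed. The thetaForecast function from the question could presumably be passed in its place, with its extra arguments given as keyword arguments.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial

import numpy as np


def naive_forecast(series, steps):
    """Hypothetical stand-in for thetaForecast: repeat the series mean."""
    return np.full(steps, series.mean())


def forecast_columns(func, data, workers=None, **kwargs):
    """Apply func to every column of data, one task per column.

    workers=1 runs serially in the current process. Any other value is
    passed to ProcessPoolExecutor as max_workers (None lets the executor
    pick a default based on the CPU count).
    """
    task = partial(func, **kwargs)  # bind the extra arguments once
    columns = list(data.T)  # each column is one time series
    if workers == 1:
        results = [task(col) for col in columns]
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            # map() returns results in the same order as the inputs
            results = list(pool.map(task, columns))
    return np.array(results)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((500, 10))  # 10 series of length 500
    out = forecast_columns(naive_forecast, data, workers=2, steps=500)
    print(out.shape)  # one row per series
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module. Each task also pays for process startup and for pickling its column and result, so a speedup is only likely when a single model fit is expensive relative to that overhead.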

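Alternatively, you could of course reimplement forecast() yourself in Numba, C, C++, or Fortran and release the GIL yourself; you could then use multiple threads within a single process.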
