I have random samples of 3D data (x, y, z), where x and y are spatial dimensions and z is a function of position on the plane: z = f(x, y). I want to evaluate this function on a regular grid by interpolating from the samples.
Each set of samples lives in a separate file and has to be interpolated independently (each file is a different point in time). To speed up the processing of these files, I want to handle them in parallel with a pool from the multiprocessing (mp) module.
However, when I call scipy.interpolate.griddata inside a function executed by the mp pool, the worker processes hang as soon as they reach the griddata call. Running the same function serially works fine. Moreover, this only happens with certain numpy/scipy builds. When my numpy.show_config() output is:
blas_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib']
    language = f77
atlas_threads_info:
    NOT AVAILABLE
blas_opt_info:
    libraries = ['blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
    NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'blas']
    library_dirs = ['/usr/lib']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
    NOT AVAILABLE
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
atlas_blas_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE
everything works fine.
But when the config is:
lapack_info:
    libraries = ['openblas']
    library_dirs = ['/u/ccor/local/lib']
    language = f77
atlas_threads_info:
    libraries = ['openblas']
    library_dirs = ['/u/ccor/local/lib']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = c
    include_dirs = ['/usr/include']
blas_opt_info:
    libraries = ['openblas']
    library_dirs = ['/u/ccor/local/lib']
    define_macros = [('ATLAS_INFO', '"\\"None\\""')]
    language = c
    include_dirs = ['/usr/include']
atlas_blas_threads_info:
    libraries = ['openblas']
    library_dirs = ['/u/ccor/local/lib']
    define_macros = [('ATLAS_INFO', '"\\"None\\""')]
    language = c
    include_dirs = ['/usr/include']
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/u/ccor/local/lib']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = f77
    include_dirs = ['/usr/include']
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE
the child processes hang.
The second configuration comes from compiling OpenBLAS (libopenblas) and numpy by hand, in an attempt to optimize my BLAS library for a machine with multiple cores. In both cases I am working inside a virtualenv with Python 2.7.3.
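For reference, both dumps above come straight from numpy.show_config(); a quick check like the following (a minimal sketch) confirms which numpy a given virtualenv actually imports:

import numpy
numpy.show_config()   # prints the BLAS/LAPACK build info dumped above
print numpy.__file__  # sanity check that the virtualenv's numpy was imported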
Also, the hang only occurs when griddata is used with 'linear' or 'cubic' interpolation; 'nearest' works fine.
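To make the trigger concrete, here is a minimal sketch (made-up data, not from my files): run inside a pool worker on the OpenBLAS build, the 'nearest' call returns but the 'linear' and 'cubic' calls never do.

import numpy
from scipy import interpolate

pts = numpy.random.random((100, 2))
vals = numpy.random.random(100)
grid = numpy.random.random((25, 2))

interpolate.griddata(pts, vals, grid, 'nearest')  # returns in a pool worker
interpolate.griddata(pts, vals, grid, 'linear')   # hangs in a pool worker on the OpenBLAS build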
Here is some code that tries to illustrate and isolate the problem. The data is generated randomly rather than read from files, and the results are just accumulated arbitrarily as an example.
import multiprocessing as mp
import numpy
from scipy import interpolate

def interpolate_grid(it, n_points, grid_dim):
    print 'interpolating: ', it
    numpy.random.seed(it)
    # Scattered sample points spanning [-0.5, 1.5)^2 so they cover the grid.
    points = 2*numpy.random.random((n_points, 2)) - 0.5
    values = numpy.random.random(n_points)
    # Regular grid_dim x grid_dim grid on [0, 1]^2, flattened to (N, 2) points.
    x, y = numpy.mgrid[0:1:grid_dim*1j, 0:1:grid_dim*1j]
    grid_pnts = numpy.vstack([x.flatten(), y.flatten()]).T
    # This is the call that hangs in a pool worker on the OpenBLAS build.
    interp_vals = interpolate.griddata(points, values, grid_pnts, 'cubic')
    print 'done interpolating: ', it
    return interp_vals

def interpolation_test(n_runs=10, n_points=1000, grid_dim=10, parallel=True):
    class Sum(object):
        def __init__(self, grid_dim):
            self.acc = numpy.zeros(grid_dim*grid_dim)
        def accumulate(self, val):
            self.acc += val

    acc = Sum(grid_dim)
    if parallel:
        pool = mp.Pool(mp.cpu_count())
        for i in xrange(n_runs):
            pool.apply_async(interpolate_grid,
                             args=[i, n_points, grid_dim],
                             callback=acc.accumulate)
        pool.close()
        pool.join()
    else:
        for i in xrange(n_runs):
            acc.acc += interpolate_grid(i, n_points, grid_dim)
    return acc.acc

def main():
    ser_result = interpolation_test(parallel=False)
    par_result = interpolation_test(parallel=True)
    assert (numpy.abs(par_result - ser_result) < 1e-14).all()
    print 'serial and parallel results the same'

if __name__ == '__main__':
    main()
EDIT: I found this on the scikit-learn install page: "Using OpenBLAS can give speedups in some scikit-learn modules, but it doesn't play nicely with joblib/multiprocessing, so using it is not recommended unless you know what you are doing."
I found a workaround by using ATLAS instead of OpenBLAS:
aptitude install libatlas-base-dev
export ATLAS=/usr/lib
pip install numpy
It is not as fast as OpenBLAS, but it works with multiprocessing, which matters more for my application, and it is still considerably faster than a basic numpy install. Would anyone who "knows what they are doing" care to comment on why OpenBLAS and multiprocessing "don't play nicely together"? Or on how to get them to play nicely? The warning on the scikit-learn site seems to imply that it is possible.
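For anyone else who hits this: one mitigation I have seen suggested, which I have not verified on this exact build (so treat it as an assumption), is to force OpenBLAS into single-threaded mode before numpy is first imported, so that the workers forked by mp.Pool never touch OpenBLAS's own thread pool:

import os
# Assumption: this OpenBLAS build honors OPENBLAS_NUM_THREADS.
# It must be set before numpy (and therefore libopenblas) is loaded.
os.environ['OPENBLAS_NUM_THREADS'] = '1'

import numpy
from scipy import interpolate
# ... set up the mp.Pool and call interpolate.griddata as in the test above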