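I am running an algorithm that is implemented in Python and uses NumPy. The most computationally expensive part of the algorithm involves solving a set of linear systems (i.e. a call to numpy.linalg.solve()). I came up with this small benchmark: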
import numpy as np
import time
# Create two large random matrices
a = np.random.randn(5000, 5000)
b = np.random.randn(5000, 5000)
t1 = time.time()
# That's the expensive call:
np.linalg.solve(a, b)
print(time.time() - t1)
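A single timed call can be skewed by one-off costs such as library loading or thread-pool start-up. The sketch below is one way to make the measurement a little more robust: it does a small warm-up solve, then reports the best of a few runs using time.perf_counter(). The helper name time_solve and its parameters are hypothetical, not part of NumPy, and np.random.default_rng assumes a reasonably recent NumPy.

```python
import time

import numpy as np


def time_solve(n, repeats=3, seed=0):
    """Return the best wall-clock time in seconds for np.linalg.solve on n x n inputs.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    b = rng.standard_normal((n, n))

    # Small warm-up call so that library loading and thread start-up
    # are not charged to the first timed run.
    np.linalg.solve(a[:64, :64], b[:64, :64])

    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        np.linalg.solve(a, b)
        best = min(best, time.perf_counter() - t0)
    return best
```

Calling time_solve(5000) uses the same problem size as the benchmark above; a smaller n is handy for a quick sanity check.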
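I have been running this on:
- My laptop, a late 2013 MacBook Pro 15-inch with 4 cores at 2GHz (sysctl -n machdep.cpu.brand_string gives me Intel(R) Core(TM) i7-4750HQ CPU @ 2.00GHz)
- An Amazon EC2 c3.xlarge instance with 4 vCPUs. Amazon advertises them as "High Frequency Intel Xeon E5-2680 v2 (Ivy Bridge) Processors"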
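Bottom line:
- On the Mac it runs in ~4.5 seconds
- On the EC2 instance it runs in ~19.5 seconds
I have also tried it on other OpenBLAS / Intel MKL based setups, and the runtime is always comparable to what I get on the EC2 instance (modulo the hardware configuration).
Can anyone explain why the performance on the Mac (with the Accelerate Framework) is more than 4x better? More details about the NumPy / BLAS setup on each machine are provided below.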
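To compare the two timings against what the hardware could plausibly deliver, it can help to convert them into an approximate floating-point rate. The sketch below assumes the usual operation count for an LU-based solve (numpy.linalg.solve calls LAPACK's gesv routine): about (2/3)n^3 flops for the factorization plus about 2n^2 flops per right-hand side for the two triangular solves. The function names are hypothetical and the figures are rough estimates, not measurements.

```python
def solve_flops(n, nrhs):
    """Approximate flop count for an LU-based solve of an n x n system with nrhs right-hand sides."""
    lu = (2.0 / 3.0) * n ** 3        # LU factorization with partial pivoting
    tri = 2.0 * n ** 2 * nrhs        # forward and back substitution for every right-hand side
    return lu + tri


def gflops(n, nrhs, seconds):
    """Convert a measured runtime into an approximate GFLOP/s rate."""
    return solve_flops(n, nrhs) / seconds / 1e9


# Timings reported above for the 5000 x 5000 benchmark.
mac = gflops(5000, 5000, 4.5)
ec2 = gflops(5000, 5000, 19.5)
print(f"Mac: {mac:.1f} GFLOP/s, EC2: {ec2:.1f} GFLOP/s, ratio: {mac / ec2:.2f}")
```

Under these assumptions the Mac sustains roughly 74 GFLOP/s and the EC2 instance roughly 17 GFLOP/s, which gives a number to set against each CPU's theoretical peak.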
Laptop setup
numpy.show_config() gives me:
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']
    define_macros = [('NO_ATLAS_INFO', 3)]
atlas_blas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    extra_compile_args = ['-msse3']
    define_macros = [('NO_ATLAS_INFO', 3)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
EC2 instance setup:
On Ubuntu 14.04, I installed OpenBLAS with
sudo apt-get install libopenblas-base libopenblas-dev
When installing NumPy, I created a site.cfg file with the following contents:
[default]
library_dirs= /usr/lib/openblas-base
[atlas]
atlas_libs = openblas
numpy.show_config() gives me:
atlas_threads_info:
    libraries = ['lapack', 'openblas']
    library_dirs = ['/usr/lib']
    define_macros = [('ATLAS_INFO', '"\\"None\\""')]
    language = f77
    include_dirs = ['/usr/include/atlas']
blas_opt_info:
    libraries = ['openblas']
    library_dirs = ['/usr/lib']
    language = f77
openblas_info:
    libraries = ['openblas']
    library_dirs = ['/usr/lib']
    language = f77
lapack_opt_info:
    libraries = ['lapack', 'openblas']
    library_dirs = ['/usr/lib']
    define_macros = [('ATLAS_INFO', '"\\"None\\""')]
    language = f77
    include_dirs = ['/usr/include/atlas']
openblas_lapack_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
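Since numpy.show_config() only reports what NumPy was linked against, it may also be worth recording how many threads each BLAS is allowed to use at run time. The sketch below is a hypothetical helper (describe_environment is not a NumPy function) that collects the NumPy version, the logical CPU count, and the thread-related environment variables commonly honoured by OpenBLAS, OpenMP, MKL, and Accelerate. An unset variable does not by itself say how many threads the library picks; it only means the library's default applies.

```python
import os

import numpy as np

# Environment variables commonly read by BLAS/LAPACK backends to cap thread counts.
THREAD_VARS = (
    "OPENBLAS_NUM_THREADS",
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
)


def describe_environment():
    """Collect a few run-time facts that affect multithreaded BLAS performance."""
    info = {
        "numpy_version": np.__version__,
        "logical_cpus": os.cpu_count(),
    }
    for name in THREAD_VARS:
        info[name] = os.environ.get(name, "<unset>")
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this on both machines next to the benchmark gives a like-for-like record of the thread settings behind each timing.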