
On Ubuntu 16.04.4, I installed the TensorFlow 1.3 ROCm port (for an AMD Radeon RX Vega 64), following the instructions under "Install required python packages" in

https://github.com/ROCmSoftwarePlatform/tensorflow/blob/rocm-v1/rocm_docs/tensorflow-install-basic.md

Before that, I had installed ROCm from the AMD Debian repository, following the instructions at

https://github.com/RadeonOpenCompute/ROCm

Then I installed the TF .whl package with pip, outside of any virtual environment:

$ wget http://repo.radeon.com/rocm/misc/tensorflow/tensorflow-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl
$ sudo python -m pip install tensorflow-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl

When I try to verify the installation with

$ python -c "import tensorflow as tf; print(tf.__version__)"

I get the following error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libCXLActivityLogger.so: cannot open shared object file: No such file or directory

I verified that _pywrap_tensorflow_internal.so exists:

$ find / -name \*pywrap\* -ls 2>/dev/null
 27526810      4 -rw-r--r--   1 root     staff        2558 Jul 20 11:41 /usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py
 27526811      4 -rw-r--r--   1 root     staff        1312 Jul 20 11:41 /usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.pyc
 27526813     92 -rw-r--r--   1 root     staff       93912 Jul 20 11:41 /usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.pyc
 27526815 227172 -rwxr-xr-x   1 root     staff   232620600 Jul 20 11:41 /usr/local/lib/python2.7/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
 27526816     72 -rw-r--r--   1 root     staff       70386 Jul 20 11:41 /usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py

I also checked my wheel and pip versions:

$ pip list | grep wheel
wheel                        0.29.0
$ pip -V
pip 10.0.1 from ---- python2.7/site-packages/pip (python 2.7)

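At first glance, it looks as if some environment variable is not set, so _pywrap_tensorflow_internal.so is not being searched for on the correct path. Can anyone tell me whether that is the case, or whether the root of the problem lies elsewhere? I did some searching and came up mostly empty. Thanks in advance for any helpful replies.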


1 Answer


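Your immediate problem is that the library file libCXLActivityLogger.so is missing, probably because you never installed it. On Ubuntu, assuming you have already added the ROCm software repository, you can get and install this library with: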

$ sudo apt-get install cxlactivitylogger
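
To confirm that the dynamic loader can now find the library, without importing all of TensorFlow, a small check along these lines may help. It is only a sketch: the helper name `can_load` is made up for illustration, and it assumes the loader resolves the library by its bare file name (through the usual search path), which is how the import in the traceback looks it up.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)  # asks the loader to dlopen() the library
        return True
    except OSError:
        # Raised when the file cannot be found or one of its own
        # dependencies is missing.
        return False


# Library name taken from the ImportError in the question.
print(can_load("libCXLActivityLogger.so"))
```

If this prints False after the package is installed, the library is probably installed in a directory the loader does not search.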

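However, you will probably find that you are missing more of the libraries needed to run TensorFlow. When you installed ROCm following the instructions at https://github.com/RadeonOpenCompute/ROCm, you installed only the "core" ROCm packages, so you are missing some additional ROCm software components that ROCm TensorFlow needs (such as MIOpen, rocRAND, rocFFT, etc.). To install these additional libraries, follow the instructions you skipped: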
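
To see every unresolved dependency at once, rather than discovering them one ImportError at a time, you could run `ldd` on the TensorFlow shared object and collect the entries it marks as "not found". The following is a rough sketch, assuming `ldd` is available and prints missing entries in its usual `name => not found` form; the function names `parse_missing` and `missing_libraries` are hypothetical helpers, not part of TensorFlow or ROCm.

```python
import subprocess


def parse_missing(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # A missing entry looks like: "\tlibfoo.so => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libraries(shared_object):
    """Run ldd on a shared object and list its unresolved dependencies."""
    output = subprocess.check_output(["ldd", shared_object])
    return parse_missing(output.decode("utf-8", "replace"))
```

Calling `missing_libraries` with the path to `_pywrap_tensorflow_internal.so` shown in the `find` output of the question should list whatever is still unresolved on that machine.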

https://github.com/ROCmSoftwarePlatform/tensorflow/blob/rocm-v1/rocm_docs/tensorflow-install-basic.md

In particular, this command will install all the ROCm packages needed to run TensorFlow:

$ sudo apt-get update && \
      sudo apt-get install -y --allow-unauthenticated \
      rocm-dkms rocm-dev rocm-libs \
      rocm-device-libs \
      hsa-ext-rocr-dev hsakmt-roct-dev hsa-rocr-dev \
      rocm-opencl rocm-opencl-dev \
      rocm-utils \
      rocm-profiler cxlactivitylogger \
      miopen-hip miopengemm
answered 2018-10-22T21:05:16.817