
I am trying out DAI 1.4.2 on ppc64le Red Hat with V100 GPUs, but I am seeing some strange errors from the dai-h2o daemon.

It seems that it cannot initialize its GPU backend, and I found that the libxgboost4j_gpu.so inside h2o.jar is built for the x86 architecture, not ppc64le.

Is this some kind of bug, or does it simply not matter?

[root@localhost home]# systemctl status dai-h2o
● dai-h2o.service - Driverless AI (H2O Process)
   Loaded: loaded (/usr/lib/systemd/system/dai-h2o.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/dai-h2o.service.d
       └─Group.conf, User.conf
   Active: active (running) since Mon 2018-12-17 14:51:23 KST; 1s ago
 Main PID: 80685 (java)
    Tasks: 93
   Memory: 155.8M
   CGroup: /system.slice/dai-h2o.service
           └─80685 /opt/h2oai/dai/jre/bin/java -Xmx65536m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Ddai.tmp=./tmp -jar /opt/h2oai/dai/h...

Dec 17 14:51:24 localhost.localdomain dai-env.sh[80685]: ======================================================================
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/linux_64/libxgboost4j_gpu.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/libxgboost4j_gpu.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Failed to load library from both native path and jar!
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/linux_64/libxgboost4j_omp.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/libxgboost4j_omp.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Failed to load library from both native path and jar!
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/linux_64/libxgboost4j_minimal.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Cannot load library from path lib/libxgboost4j_minimal.so
Dec 17 14:51:25 localhost.localdomain dai-env.sh[80685]: Failed to load library from both native path and jar!

[root@localhost home]# netstat -an | grep 12345

[root@localhost home]# ls -l /opt/h2oai/dai/h2o.jar
-rw-r--r-- 1 root root 109623422 Dec  4 07:45 /opt/h2oai/dai/h2o.jar

[root@localhost home]# jar -xvf /opt/h2oai/dai/h2o.jar lib/linux_64/libxgboost4j_gpu.so
 inflated: lib/linux_64/libxgboost4j_gpu.so

[root@localhost home]# ls -l lib/linux_64/libxgboost4j_gpu.so
-rw-r--r-- 1 root root 34754400 Jul  8 12:56 lib/linux_64/libxgboost4j_gpu.so

[root@localhost home]# file lib/linux_64/libxgboost4j_gpu.so
lib/linux_64/libxgboost4j_gpu.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=c7c3682ccf33d3d0395772e924be1e416a60a2c4, not stripped
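
The manual `jar -xvf` plus `file` check above can also be scripted. What follows is only a minimal sketch: the helper names `elf_machine` and `native_libs_in_jar` are made up for illustration and are not part of Driverless AI or H2O. It reads the `e_machine` field of the ELF header for every `.so` entry in a jar, without extracting anything to disk. The table of machine codes is a small subset of the ELF specification.

```python
import struct
import zipfile

# Subset of e_machine codes from the ELF specification.
# ppc64le reports as EM_PPC64 (0x15) with little-endian EI_DATA.
ELF_MACHINES = {
    0x03: "x86",
    0x15: "ppc64",
    0x3E: "x86-64",
    0xB7: "aarch64",
}


def elf_machine(header: bytes) -> str:
    """Return the architecture named in an ELF header, or 'not ELF'."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not ELF"
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:x})")


def native_libs_in_jar(jar_path) -> dict:
    """Map every .so entry in the jar to its ELF architecture."""
    result = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.endswith(".so"):
                with jar.open(name) as entry:
                    # Only the first 20 bytes are needed for e_machine.
                    result[name] = elf_machine(entry.read(20))
    return result
```

Calling `native_libs_in_jar("/opt/h2oai/dai/h2o.jar")` should give one entry per bundled native library, so a mismatch with the host architecture shows up at a glance.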

1 Answer


It doesn't matter at all.

The h2o.jar bundled with Driverless AI is not used for GPU work at all.

All GPU usage comes from the Driverless AI Python process.

Answered 2018-12-17T06:55:22.327