machine-learning - Tensorflow Bazel 0.3.0 构建 CUDA 8.0 GTX 1070 失败

Question

这是我的规格：

GTX 1070
驱动程序 367（从 .run 安装）
Ubuntu 16.04
CUDA 8.0（从 .run 安装）
库登 5
Bazel 0.3.0（潜在问题？）
GCC 4.9.3
从源代码安装的 TensorFlow

要验证版本：

volcart@volcart-Precision-Tower-7910:~/$ nvidia-smi
Fri Aug  5 15:03:32 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35                 Driver Version: 367.35                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:03:00.0      On |                  N/A |
|  0%   38C    P8    11W / 185W |    495MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     20303    G   /usr/lib/xorg/Xorg                             280MiB |
|    0     20909    G   compiz                                         114MiB |
|    0     21562    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd    98MiB |
+-----------------------------------------------------------------------------+
volcart@volcart-Precision-Tower-7910:~/$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Wed_May__4_21:01:56_CDT_2016
Cuda compilation tools, release 8.0, V8.0.26
volcart@volcart-Precision-Tower-7910:~/$ bazel version
Build label: 0.3.0
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jun 10 11:38:23 2016 (1465558703)
Build timestamp: 1465558703
Build timestamp as int: 1465558703
volcart@volcart-Precision-Tower-7910:~/$ gcc -vUsing built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.9/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.9.3-13ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.9.3 (Ubuntu 4.9.3-13ubuntu2)

我确实切换了 bazel 版本，所以我bazel clean成功执行了。

我可以通过以下方式验证 CUDA 是否正常运行~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery$

volcart@volcart-Precision-Tower-7910:~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1070"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 8113 MBytes (8507162624 bytes)
  (15) Multiprocessors, (128) CUDA Cores/MP:     1920 CUDA Cores
  GPU Max Clock rate:                            1797 MHz (1.80 GHz)
  Memory Clock rate:                             4004 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 3 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS

当我./configure输入所有默认值时。

当前的错误

当我构建训练示例时，我得到了这个：

volcart@volcart-Precision-Tower-7910:/usr/local/lib/python2.7/dist-packages/tensorflow$ sudo bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
Sending SIGTERM to previous Bazel server (pid=7108)... done.
.
INFO: Found 1 target...

...

./tensorflow/core/platform/default/logging.h: In instantiation of 'std::string* tensorflow::internal::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = long unsigned int; std::string = std::basic_string<char>]':
tensorflow/core/common_runtime/gpu/gpu_device.cc:567:5:   required from here
./tensorflow/core/platform/default/logging.h:197:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
                                   ^
./tensorflow/core/platform/macros.h:54:29: note: in definition of macro 'TF_PREDICT_TRUE'
 #define TF_PREDICT_TRUE(x) (x)
                             ^
./tensorflow/core/platform/default/logging.h:197:1: note: in expansion of macro 'TF_DEFINE_CHECK_OP_IMPL'
 TF_DEFINE_CHECK_OP_IMPL(Check_LT, < )
 ^
ERROR: /usr/local/lib/python2.7/dist-packages/tensorflow/tensorflow/cc/BUILD:199:1: Linking of rule '//tensorflow/cc:tutorials_example_trainer' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/local_linux-opt/bin/tensorflow/cc/tutorials_example_trainer ... (remaining 805 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
bazel-out/local_linux-opt/bin/tensorflow/cc/_objs/tutorials_example_trainer/tensorflow/cc/tutorials/example_trainer.o: In function `tensorflow::example::ConcurrentSteps(tensorflow::example::Options const*, int)':
example_trainer.cc:(.text._ZN10tensorflow7example15ConcurrentStepsEPKNS0_7OptionsEi+0x517): undefined reference to `google::protobuf::internal::empty_string_'
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/libidentity_reader_op.lo(identity_reader_op.o): In function `tensorflow::IdentityReader::SerializeStateLocked(std::string*)':
identity_reader_op.cc:(.text._ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs[_ZN10tensorflow14IdentityReader20SerializeStateLockedEPSs]+0x36): undefined reference to `google::protobuf::MessageLite::SerializeToString(std::string*) const'
bazel-out/local_linux-opt/bin/tensorflow/core/kernels/libwhole_file_read_ops.lo(whole_file_read_ops.o): In function `tensorflow::WholeFileReader::SerializeStateLocked(std::string*)':

当我尝试构建pip包时，我得到了这个：

volcart@volcart-Precision-Tower-7910:/usr/local/lib/python2.7/dist-packages/tensorflow$ sudo bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
WARNING: /usr/local/lib/python2.7/dist-packages/tensorflow/util/python/BUILD:11:16: in includes attribute of cc_library rule //util/python:python_headers: 'python_include' resolves to 'util/python/python_include' not in 'third_party'. This will be an error in the future.
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/bit_depth.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/gemmlowp.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/map.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/public/output_stages.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/instrumentation.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
WARNING: /home/volcart/.cache/bazel/_bazel_root/109ad80a732aaece8a87d1e3693889e7/external/gemmlowp/BUILD:102:12: in hdrs attribute of cc_library rule @gemmlowp//:eight_bit_int_gemm: Artifact 'external/gemmlowp/profiling/profiler.h' is duplicated (through '@gemmlowp//:eight_bit_int_gemm_public_headers' and '@gemmlowp//:gemmlowp_headers').
INFO: Found 1 target...
INFO: From Compiling external/protobuf/src/google/protobuf/util/internal/utility.cc [for host]:


...

INFO: From Compiling tensorflow/core/distributed_runtime/tensor_coding.cc:
tensorflow/core/distributed_runtime/tensor_coding.cc: In member function 'bool tensorflow::TensorResponse::ParseTensorSubmessage(google::protobuf::io::CodedInputStream*, tensorflow::TensorProto*)':
tensorflow/core/distributed_runtime/tensor_coding.cc:123:23: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
         if (num_bytes != buf.size()) return false;
                       ^
ERROR: /usr/local/lib/python2.7/dist-packages/tensorflow/tensorflow/core/kernels/BUILD:1498:1: undeclared inclusion(s) in rule '//tensorflow/core/kernels:batchtospace_op_gpu':
this rule is missing dependency declarations for the following files included by 'tensorflow/core/kernels/batchtospace_op_gpu.cu.cc':
  '/usr/local/cuda-8.0/include/cuda_runtime.h'
  '/usr/local/cuda-8.0/include/host_config.h'
  '/usr/local/cuda-8.0/include/builtin_types.h'
  '/usr/local/cuda-8.0/include/device_types.h'
  '/usr/local/cuda-8.0/include/host_defines.h'
  '/usr/local/cuda-8.0/include/driver_types.h'
  '/usr/local/cuda-8.0/include/surface_types.h'
  '/usr/local/cuda-8.0/include/texture_types.h'
  '/usr/local/cuda-8.0/include/vector_types.h'
  '/usr/local/cuda-8.0/include/library_types.h'
  '/usr/local/cuda-8.0/include/channel_descriptor.h'
  '/usr/local/cuda-8.0/include/cuda_runtime_api.h'
  '/usr/local/cuda-8.0/include/cuda_device_runtime_api.h'
  '/usr/local/cuda-8.0/include/driver_functions.h'
  '/usr/local/cuda-8.0/include/vector_functions.h'
  '/usr/local/cuda-8.0/include/vector_functions.hpp'
  '/usr/local/cuda-8.0/include/common_functions.h'
  '/usr/local/cuda-8.0/include/math_functions.h'
  '/usr/local/cuda-8.0/include/math_functions.hpp'
  '/usr/local/cuda-8.0/include/math_functions_dbl_ptx3.h'
  '/usr/local/cuda-8.0/include/math_functions_dbl_ptx3.hpp'
  '/usr/local/cuda-8.0/include/cuda_surface_types.h'
  '/usr/local/cuda-8.0/include/cuda_texture_types.h'
  '/usr/local/cuda-8.0/include/device_functions.h'
  '/usr/local/cuda-8.0/include/device_functions.hpp'
  '/usr/local/cuda-8.0/include/device_atomic_functions.h'
  '/usr/local/cuda-8.0/include/device_atomic_functions.hpp'
  '/usr/local/cuda-8.0/include/device_double_functions.h'
  '/usr/local/cuda-8.0/include/device_double_functions.hpp'
  '/usr/local/cuda-8.0/include/sm_20_atomic_functions.h'
  '/usr/local/cuda-8.0/include/sm_20_atomic_functions.hpp'
  '/usr/local/cuda-8.0/include/sm_32_atomic_functions.h'
  '/usr/local/cuda-8.0/include/sm_32_atomic_functions.hpp'
  '/usr/local/cuda-8.0/include/sm_35_atomic_functions.h'
  '/usr/local/cuda-8.0/include/sm_60_atomic_functions.h'
  '/usr/local/cuda-8.0/include/sm_60_atomic_functions.hpp'
  '/usr/local/cuda-8.0/include/sm_20_intrinsics.h'
  '/usr/local/cuda-8.0/include/sm_20_intrinsics.hpp'
  '/usr/local/cuda-8.0/include/sm_30_intrinsics.h'
  '/usr/local/cuda-8.0/include/sm_30_intrinsics.hpp'
  '/usr/local/cuda-8.0/include/sm_32_intrinsics.h'
  '/usr/local/cuda-8.0/include/sm_32_intrinsics.hpp'
  '/usr/local/cuda-8.0/include/sm_35_intrinsics.h'
  '/usr/local/cuda-8.0/include/surface_functions.h'
  '/usr/local/cuda-8.0/include/texture_fetch_functions.h'
  '/usr/local/cuda-8.0/include/texture_indirect_functions.h'
  '/usr/local/cuda-8.0/include/surface_indirect_functions.h'
  '/usr/local/cuda-8.0/include/device_launch_parameters.h'
  '/usr/local/cuda-8.0/include/cuda_fp16.h'
  '/usr/local/cuda-8.0/include/math_constants.h'
  '/usr/local/cuda-8.0/include/curand_kernel.h'
  '/usr/local/cuda-8.0/include/curand.h'
  '/usr/local/cuda-8.0/include/curand_discrete.h'
  '/usr/local/cuda-8.0/include/curand_precalc.h'
  '/usr/local/cuda-8.0/include/curand_mrg32k3a.h'
  '/usr/local/cuda-8.0/include/curand_mtgp32_kernel.h'
  '/usr/local/cuda-8.0/include/cuda.h'
  '/usr/local/cuda-8.0/include/curand_mtgp32.h'
  '/usr/local/cuda-8.0/include/curand_philox4x32_x.h'
  '/usr/local/cuda-8.0/include/curand_globals.h'
  '/usr/local/cuda-8.0/include/curand_uniform.h'
  '/usr/local/cuda-8.0/include/curand_normal.h'
  '/usr/local/cuda-8.0/include/curand_normal_static.h'
  '/usr/local/cuda-8.0/include/curand_lognormal.h'
  '/usr/local/cuda-8.0/include/curand_poisson.h'
  '/usr/local/cuda-8.0/include/curand_discrete2.h'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 138.913s, Critical Path: 102.63s

score 2 · Accepted Answer

我看到有人抱怨 bazel 0.3.1，可能需要降级到 0.3.0。您给出的错误信息不是很丰富，这只是父脚本说子脚本失败，控制台上应该有更多信息以及实际错误。

两天前，我完成了 GTX 1080 的设置步骤，它与这个配置一起工作。

Ubuntu 16.04
Nvidia Driver: nvidia-367.35 (installed from .run file)
Bazel 0.3.0
gcc: 4.9.3 (default with 16.04)
CUDA 8.0.27 (installed from .run file into default dirs)
compute capability: (use default values for config)

machine-learning - Tensorflow Bazel 0.3.0 构建 CUDA 8.0 GTX 1070 失败

1 回答 1

Related

Reference