tensorflow - How do I activate TensorFlow's XLA for the C API?

Question

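I built TensorFlow from source and am using its C API. So far everything works well, and I am also using AVX/AVX2. My source build of TensorFlow also includes XLA support. I would now like to activate XLA (Accelerated Linear Algebra) as well, since I hope it will further improve performance/speed during inference.

When I run my program now, I get the following message: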

2019-06-17 16:09:06.753737: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1541] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.

On the official XLA page (https://www.tensorflow.org/xla/jit) I found information on how to turn on JIT at the session level:

# Config to turn on JIT compilation
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1

sess = tf.Session(config=config)

This issue (https://github.com/tensorflow/tensorflow/issues/13853) explains how to call TF_SetConfig in the C API. Previously, using the output of this Python code, I was able to restrict execution to one core:

config1 = tf.ConfigProto(device_count={'CPU':1})
serialized1 = config1.SerializeToString()
print(list(map(hex, serialized1)))

I implemented it as follows:

// Threads for operations that can be parallelized internally, such as matrix multiplication.
uint8_t intra_op_parallelism_threads = maxCores;
// Threads for operations that are independent in the TensorFlow graph,
// i.e. there is no directed path between them in the dataflow graph.
uint8_t inter_op_parallelism_threads = maxCores;
uint8_t config[] = {0x10, intra_op_parallelism_threads, 0x28, inter_op_parallelism_threads};
TF_SetConfig(sess_opts, config, sizeof(config), status);

So I thought the same approach would help to activate XLA:

config= tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
output = config.SerializeToString()
print(list(map(hex, output)))

The implementation this time:

uint8_t config[] = {0x52, 0x4, 0x1a, 0x2, 0x28, 0x1};
TF_SetConfig(sess_opts, config, sizeof(config), status);
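For readers wondering where those byte values come from, the following is a minimal sketch of the protobuf wire format that reproduces both arrays without importing TensorFlow. The helper names are made up for illustration, and the field numbers (graph_options = 10, optimizer_options = 3, global_jit_level = 5, intra_op_parallelism_threads = 2, inter_op_parallelism_threads = 5) are inferred from the bytes shown in the question; they should be checked against the config.proto of the TensorFlow version in use.

```python
VARINT = 0            # wire type for integers and enums
LENGTH_DELIMITED = 2  # wire type for nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def varint_field(field_number: int, value: int) -> bytes:
    return tag(field_number, VARINT) + encode_varint(value)


def message_field(field_number: int, payload: bytes) -> bytes:
    return tag(field_number, LENGTH_DELIMITED) + encode_varint(len(payload)) + payload


# Assumed field numbers, inferred from the bytes in the question.
optimizer_options = varint_field(5, 1)               # global_jit_level = ON_1
graph_options = message_field(3, optimizer_options)  # optimizer_options
xla_config = message_field(10, graph_options)        # graph_options

# Thread limits, here with 4 threads as an example value.
threads_config = varint_field(2, 4) + varint_field(5, 4)

print([hex(b) for b in xla_config])  # ['0x52', '0x4', '0x1a', '0x2', '0x28', '0x1']
```

One consequence of the varint encoding: writing a thread count directly into a single uint8_t slot, as in the earlier C snippet, only yields a valid message for values up to 127, because larger values need a second byte.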

But XLA still seems to be deactivated. Can someone help me with this? Or, taking another look at the warning:

2019-06-17 16:09:06.753737: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1541] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.

Does this mean I have to set XLA_FLAGS during the build?

Thanks in advance!

score 2 · Accepted Answer

OK, I figured out how to use the XLA JIT: it is only available through the c_api_experimental.h header. Just include this header and then call:

TF_EnableXLACompilation(sess_opts,true);
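The warning quoted in the question also mentions the TF_XLA_FLAGS environment variable. That variable is read when the process runs, not when TensorFlow is built. A minimal sketch of launching a program with it set follows; the helper name is hypothetical, and the child process is only a Python one-liner standing in for the real C API binary. Whether the variable alone is enough for a C API session is not confirmed in this thread.

```python
import os
import subprocess
import sys


def run_with_xla_flags(command, flags="--tf_xla_cpu_global_jit"):
    """Run `command` with TF_XLA_FLAGS set in the child's environment only."""
    env = dict(os.environ)        # copy, so the parent environment is untouched
    env["TF_XLA_FLAGS"] = flags   # flag string taken from the warning message
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in child: echoes the variable it sees. Replace this list with the
# path to your own executable (hypothetical, not part of the original post).
child = [sys.executable, "-c", "import os; print(os.environ['TF_XLA_FLAGS'])"]
result = run_with_xla_flags(child)
print(result.stdout.strip())
```

Setting the variable per launch keeps the choice out of the binary, so the same build can be run with and without the flag for comparison.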

score 1 · Accepted Answer

@tre95 I tried

#include "c_api_experimental.h"
TF_SessionOptions* options = TF_NewSessionOptions();
TF_EnableXLACompilation(options, true);

but the build failed with the error collect2: error: ld returned 1 exit status. Without these lines, it compiles and runs successfully.
