I followed the instructions at Cloud TPU Tools. Everything seems to work as expected, except for step 4, where you have to change --tpu_name to --tpu.

What fails is generating the Profile tab. I ran

capture_tpu_profile --tpu_name=$TPU_NAME --logdir=${model_dir}

which produced

Welcome to the Cloud TPU Profiler v1.6.0
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
Profile session succeed for host(s):10.240.1.2

I refreshed and restarted TensorBoard several times, but there is no Profile tab, and selecting Profile from the dropdown menu returns no data.

Is this a known issue with the Cloud TPU profiler?

--Edit 1--

Profiler v1.5.2 fails to collect any trace events.

Welcome to the Cloud TPU Profiler v1.5.2
Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.

Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 2
Limiting the number of trace events to 1000000
No trace event is collected. Automatically retrying.

Starting to profile TPU traces for 2000 ms. Remaining attempt(s): 1
Limiting the number of trace events to 1000000
No trace event is collected after 3 attempt(s). Perhaps, you want to try again (with more attempts?).
Tip: increase number of attempts with --num_tracing_attempts.

1 Answer

Could you try again with Cloud TPU Profiler 1.5.2?

pip install cloud-tpu-profiler==1.5.2

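Cloud TPU profiler 1.6.0 and its worker list feature are only supported on the current master branch of tensorflow. It remains backward compatible with tf-1.8 when invoked with the following command:

capture_tpu_profile --service_addr=10.240.1.2 --logdir=${model_dir}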
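
After re-running the capture, it may help to check whether any profile data actually reached the log directory before restarting TensorBoard. The sketch below is only an illustration: list_profile_runs is a hypothetical helper, and it assumes that the profiler writes each capture under ${logdir}/plugins/profile/<run>/ and that the log directory is a local path. For a gs:// bucket, something like gsutil ls on the same subdirectory would presumably be needed instead.

```shell
# Hypothetical helper (not part of cloud-tpu-profiler): list the profile runs
# found in a local log directory.
# Assumption: captures are written to <logdir>/plugins/profile/<run>/.
list_profile_runs() {
  profile_dir="$1/plugins/profile"
  if [ ! -d "${profile_dir}" ]; then
    # Nothing for TensorBoard to display yet.
    echo "no profile data under ${profile_dir}" >&2
    return 1
  fi
  # Print one run directory name per line.
  ls -1 "${profile_dir}"
}
```

For example, list_profile_runs "${model_dir}" prints the run names if a capture was saved, and returns a non-zero status otherwise, which would suggest the problem lies in the capture step rather than in TensorBoard.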

Answered 2018-05-16T23:35:16.453