tensorflow - TF-Slim: (unreasonable?) out of memory

Question

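I am trying to run one of the TF-Slim tutorials, the one that fine-tunes Inception-V3 (~104 MB) on the flowers dataset. The GPU has about 2 GB of memory. When the batch size is larger than 8, I get an error because the GPU runs out of memory. In fact, I seem to get several messages, each looking like: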

W tensorflow/core/common_runtime/bfc_allocator.cc:217] Ran out of memory trying to allocate 646.50MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

and

W tensorflow/core/common_runtime/bfc_allocator.cc:274]     **************************************x*************************************************************
W tensorflow/core/common_runtime/bfc_allocator.cc:275] Ran out of memory trying to allocate 168.8KiB.  See logs for memory state.

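Now, it may well be that my GPU simply does not have enough RAM. However, 2 GB seems like plenty to load a ~100 MB model. Also, with Caffe I can fine-tune AlexNet (~400 MB) without any problem. I also tried allowing GPU memory growth (which, as I understand it, makes use of system RAM):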

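import tensorflow as tf
session_config = tf.ConfigProto(allow_soft_placement=True)
# Grow the GPU allocation on demand instead of reserving all memory up front
session_config.gpu_options.allow_growth = True
session_config.gpu_options.allocator_type = 'BFC'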

But this does not seem to help.

Do you know whether

a) I am doing something wrong, b) the GPU is not big enough, or c) TF-Slim by construction consumes too much memory?

Thanks,

score 0 · Accepted Answer

Is some other process using so much GPU memory that little is left for TensorFlow? I believe nvidia-smi will tell you how much GPU memory is already in use.
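As a rough sketch of that check, the snippet below asks nvidia-smi for used and total memory per GPU and computes what is left. The helper names `parse_gpu_memory` and `query_gpu_memory` are made up for illustration, and the parser assumes nvidia-smi reports plain integer MiB values for every GPU, which may not hold on all hardware.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one GPU per line) into a list of dicts.

    Assumes every field is a plain integer; values such as "[N/A]" would
    raise ValueError here.
    """
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        used, total = (int(field.strip()) for field in line.split(","))
        gpus.append({"used_mib": used, "total_mib": total, "free_mib": total - used})
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return per-GPU memory usage (needs an NVIDIA driver)."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_gpu_memory(output)
```

Calling `query_gpu_memory()` on the training machine before starting the fine-tuning job would show whether a display server or another process already holds a large share of the 2 GB.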

If that is not the case, you may want to look at the allocations to see what is going on. See the other question on how to log allocations from TensorFlow.
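Short of full allocation logging, one quick way to get a first picture is to collect the sizes of the failed requests from the allocator warnings already in the training log. The sketch below does only that. The function name `failed_allocations` is hypothetical, and the pattern assumes the message wording shown in the question, which may differ in other TensorFlow versions.

```python
import re

# Matches e.g. "Ran out of memory trying to allocate 646.50MiB"
_OOM_PATTERN = re.compile(
    r"Ran out of memory trying to allocate ([\d.]+)(KiB|MiB|GiB|B)\b"
)
_UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def failed_allocations(log_text):
    """Return the size in bytes of every failed allocation found in log_text."""
    sizes = []
    for value, unit in _OOM_PATTERN.findall(log_text):
        sizes.append(int(round(float(value) * _UNIT_BYTES[unit])))
    return sizes
```

If the largest failed request is in the hundreds of MiB, as in the first warning above, it is more likely a batch-sized buffer than model weights, which would point toward the batch size rather than the model itself.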
