Is there any way to run object detection on a 2GB graphics card? The motherboard has 24GB of DDR3 RAM; can't I use that for the GPU as well?

I did try adding session_config.gpu_options.allow_growth=True in trainer.py, but it did not help. It looks like the graphics card simply does not have enough memory.

Card info:

0, name: GeForce GTX 650, pci bus id: 0000:01:00.0)
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4876955943962853047
, name: "/gpu:0"
device_type: "GPU"
memory_limit: 1375862784
locality {
  bus_id: 1
}
incarnation: 4236842880144430162
physical_device_desc: "device: 0, name: GeForce GTX 650, pci bus id: 0000:01:00.0"
]

train.py output:

Limit:                   219414528
InUse:                   192361216
MaxInUse:                192483072
NumAllocs:                    6030
MaxAllocSize:              6131712

2017-09-13 13:47:13.429510: W tensorflow/core/common_runtime/bfc_allocator.cc:277] ****************************************************************************************____________
2017-09-13 13:47:13.481829: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InternalError'>, Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.955327: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
2017-09-13 13:47:13.956056: W tensorflow/core/framework/op_kernel.cc:1192] Internal: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_299 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3432_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1327, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
     [[Node: prefetch_queue_Dequeue/_5471 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_5476_prefetch_queue_Dequeue", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/dee/Documents/projects/tensor/models/object_detection/trainer.py", line 297, in train
    saver=saver)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 755, in train
    sess, train_op, global_step, train_step_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/learning.py", line 488, in train_step
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
1 Answer

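In fact, the Dst tensor is not initialized message indicates that your GPU is running out of memory. You can try reducing the batch size to the minimum and lowering the resolution of the images you feed into the model. Also try the SSD Mobilenet model, since it is very lightweight.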
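To get a feel for how much batch size and resolution matter, here is a rough sketch that only counts the float32 input batch itself. The (batch size, height, width) settings are hypothetical values picked for illustration, not settings taken from the question, and real usage will be considerably higher because weights, activations and gradients are not counted.

```python
# Rough size of one float32 input batch.
# Weights, activations and gradients are NOT counted, so treat this as a lower bound.
BYTES_PER_FLOAT32 = 4

# memory_limit reported for /gpu:0 in the question's device listing.
GPU_MEMORY_LIMIT = 1375862784


def batch_bytes(batch_size, height, width, channels=3):
    """Bytes needed to hold one batch of float32 images."""
    return batch_size * height * width * channels * BYTES_PER_FLOAT32


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / float(1024 ** 2)


# Hypothetical (batch size, height, width) settings, for illustration only.
candidates = [(24, 600, 600), (8, 300, 300), (1, 300, 300)]

print("GPU limit: %.1f MiB" % to_mib(GPU_MEMORY_LIMIT))
for batch_size, height, width in candidates:
    size = batch_bytes(batch_size, height, width)
    share = 100.0 * size / GPU_MEMORY_LIMIT
    print("batch=%d at %dx%d: %.1f MiB (%.2f%% of the GPU limit)"
          % (batch_size, height, width, to_mib(size), share))
```

The input batch scales linearly with the batch size and with the pixel count, so halving both image dimensions cuts it to a quarter. Intermediate feature maps shrink in a similar way, which is why these two settings are usually the first things to try on a small card.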

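To answer the second part of your question: I had always assumed that modern GPUs would fall back to a hybrid mode in which the driver/GPU starts streaming resources from system RAM over the PCIe bus to make up for the "missing" VRAM. Since system RAM is 3-5 times slower than GDDR5 and has higher latency, running out of VRAM would translate into a significant performance loss. However, I ran into the same problem on a GTX 1060 with 6GB of VRAM, where the CUDA process crashed because the GPU ran out of memory.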

answered 2017-09-15T13:00:26.780