tensorflow - Model hangs when using MirroredStrategy()

Question

I am trying to use MirroredStrategy() from TensorFlow to train a Keras model on multiple GPUs (2). However, it results in the following error:

Epoch 1/5
WARNING:tensorflow:From /home/user/conda36/lib/python3.6/site-packages/tensorflow/python/data/ops/multi_device_iterator_ops.py:601: get_next_as_optional (from tensorflow.python.data.ops.iterator_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Iterator.get_next_as_optional()` instead.
2020-09-06 18:50:54.766930: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-06 18:50:55.069925: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-06 18:50:56.049400: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at depthwise_conv_op.cc:386 : Invalid argument: Computed output size would be negative: -1 [input_size: 3, effective_filter_size: 5, stride: 1]

At this point the model freezes: the process is still running, but it makes no progress.
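
For reference, the -1 in the last log line is consistent with the usual output-size arithmetic for a convolution without padding: a filter of size 5 does not fit into an input of size 3. The snippet below is only a minimal sketch of that arithmetic under the assumption that the failing layer uses 'valid' padding; the helper name conv_output_size is hypothetical and is not a TensorFlow API, and it says nothing about why the shape differs under MirroredStrategy().

```python
def conv_output_size(input_size, filter_size, stride=1, dilation=1, padding="VALID"):
    """Output size of a convolution along one spatial dimension.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    # A dilated filter covers (filter_size - 1) * dilation + 1 input positions.
    effective_filter_size = (filter_size - 1) * dilation + 1
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return (input_size - effective_filter_size + stride) // stride
    if padding == "SAME":
        # Padding is added so that the output size is ceil(input_size / stride).
        return (input_size + stride - 1) // stride
    raise ValueError("padding must be 'VALID' or 'SAME'")


# Values taken from the log: input_size 3, effective_filter_size 5, stride 1.
print(conv_output_size(3, 5, stride=1))                  # -1
print(conv_output_size(3, 5, stride=1, padding="SAME"))  # 3
```

If this assumption holds, the tensor reaching the depthwise convolution has a spatial dimension of only 3, so checking the input shape that each replica actually receives may be a reasonable first step.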

If I run it without MirroredStrategy(), it works perfectly well, but of course it only uses one GPU.

I am using MirroredStrategy() like this:

def get_model():
    .
    .
    .
    decoded = Conv2D(1, (3, 3), activation='linear', padding='same')(d)

    autoencoder = Model(input_img, decoded)
    autoencoder.summary()
    autoencoder.compile(optimizer='Adagrad', loss='mean_squared_error')
    return autoencoder

if __name__ == '__main__':
    # Model
    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        autoencoder = get_model()

What could be causing this?
