I am trying to run a Keras script on an AWS instance. The script runs fine on my own computer (Python 2.7, CPU only), but it raises errors on a GPU-enabled AWS instance. I have installed the latest version of Theano, and other scripts (e.g. the mnist tutorial) run without errors.
The script causing the problem is the standard Keras tutorial script ( https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py ). Following guidance from Stack Overflow, I changed the border mode to "valid", which seemed to fix one problem. But I then immediately ran into the issue below (error stack follows). I ran the following line in bash, "THEANO_FLAGS=optimizer=fast_compile,device=gpu,floatX=float32 cifar10.py", but that did not provide any more information. Perhaps I should switch to the nolearn / lasagne packages, but if there is a simple way to fix this, please let me know.
Using Theano backend.
Using gpu device 0: GRID K520 (CNMeM is disabled)
X_train shape: (50000, 3, 32, 32)
50000 train samples
10000 test samples
Using real time data augmentation
----------------------------------------
Epoch 0
----------------------------------------
Training...
Testing...
Traceback (most recent call last):
  File "keras_python_4.py", line 152, in <module>
    score = model.test_on_batch(X_batch, Y_batch)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 445, in test_on_batch
    return self._test(ins)
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py", line 357, in __call__
    return self.function(*inputs)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 1 (indices start at 0) has shape[0] == 32, but the output's size on that axis is 16.
Apply node that caused the error: GpuElemwise{Composite{(i0 * log(clip((i1 / i2), i3, i4)))}}[(0, 0)](GpuFromHost.0, GpuSoftmaxWithBias.0, GpuDimShuffle{0,x}.0, CudaNdarrayConstant{[[ 1.00000001e-07]]}, CudaNdarrayConstant{[[ 0.99999988]]})
Toposort index: 95
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, col), CudaNdarrayType(float32, (True, True)), CudaNdarrayType(float32, (True, True))]
Inputs shapes: [(16, 10), (32, 10), (32, 1), (1, 1), (1, 1)]
Inputs strides: [(10, 1), (10, 1), (1, 0), (0, 0), (0, 0)]
Inputs values: ['not shown', 'not shown', 'not shown', <CudaNdarray object at 0x7f2165bb7730>, <CudaNdarray object at 0x7f2165bb7970>]
Outputs clients: [[GpuCAReduce{add}{0,1}(GpuElemwise{Composite{(i0 * log(clip((i1 / i2), i3, i4)))}}[(0, 0)].0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
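For reference, the input shapes reported in the traceback are (16, 10) and (32, 10), i.e. the two tensors reaching the elementwise op have different numbers of rows. The same mismatch can be reproduced outside Theano with plain NumPy (this is a minimal sketch of the shape conflict, not the original Keras code; whether the 16-row tensor comes from a final partial batch is my guess, not confirmed):

```python
import numpy as np

# Mimic the two tensors the GpuElemwise node received, per the
# "Inputs shapes" line in the traceback: (16, 10) vs (32, 10).
a = np.ones((16, 10), dtype=np.float32)  # e.g. predictions for 16 samples
b = np.ones((32, 10), dtype=np.float32)  # e.g. labels for 32 samples

try:
    a * b  # elementwise multiply, like the log-loss term in the error
except ValueError as e:
    # ValueError: operands could not be broadcast together with shapes (16,10) (32,10)
    print(e)
```

If this reading is right, checking that `X_batch` and `Y_batch` passed to `model.test_on_batch` always have the same first dimension would be a first step in narrowing the problem down.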