I have read all the posts online about people forgetting to change their target vector to a matrix, and since the problem persists after making that change, I decided to ask my question here. A workaround is described below, but it raises a new problem; thanks in advance for any advice!
Using a convolutional network setup with binary cross-entropy and a sigmoid activation function, I get a dimension mismatch, but not during training, only during evaluation on the validation/test data. For some odd reason, the dimensions of my validation-set vector get switched, and I have no clue why. As stated above, training works fine. The code is below (most of it copied from the lasagne tutorial example); thanks a lot for any help (and sorry for hijacking the thread, but I saw no reason to create a new one).
Workaround and new problem:
- Removing "axis=1" in the valAcc definition helps, but validation accuracy stays at zero, and test classification always returns the same result, no matter how many nodes, layers, filters etc. I use. Even changing the size of the training set (I have roughly 350 samples per class, each a 48x64 grayscale image) does not change this, so something seems to be off (see the numpy shape sketch right below).
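A minimal numpy sketch (toy values, not my actual arrays) of what I believe happens with a one-unit output layer, which would explain both the constant classification result and the zero accuracy:

import numpy as np

pred = np.array([[0.7], [0.2], [0.9]])   # 1-unit sigmoid output: shape (3, 1)
targets = np.array([[1], [0], [0]])      # target matrix: shape (3, 1)

print(np.argmax(pred, axis=1))   # -> [0 0 0]: argmax over a single column is always 0
print(np.argmax(pred))           # -> 2: without axis=1 the whole batch is flattened
                                 #    into one scalar index

So with axis=1 the "predicted class" is 0 for every sample regardless of the data, and without the axis the result is a single scalar that gets broadcast against the whole target matrix; neither seems meaningful for a single sigmoid unit.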
Network creation:
def build_cnn(imgSet, input_var=None):
    # As a third model, we'll create a CNN of two convolution + pooling stages
    # and a fully-connected hidden layer in front of the output layer.

    # Input layer, using shape information from the training set:
    network = lasagne.layers.InputLayer(shape=(None, imgSet.shape[1],
                                               imgSet.shape[2], imgSet.shape[3]),
                                        input_var=input_var)

    # This time we do not apply input dropout, as it tends to work less well
    # for convolutional layers.

    # Convolutional layer with 32 kernels of size 5x5. Strided and padded
    # convolutions are supported as well; see the docstring.
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=32, filter_size=(5, 5),
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())

    # Max-pooling layer of factor 2 in both dimensions:
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))

    # Another convolution with 16 5x5 kernels, and another 2x2 pooling:
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=16, filter_size=(5, 5),
        nonlinearity=lasagne.nonlinearities.rectify)
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))

    # A fully-connected layer of 64 units with 25% dropout on its inputs:
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=.25),
        num_units=64,
        nonlinearity=lasagne.nonlinearities.rectify)

    # And, finally, the 1-unit sigmoid output layer with 50% dropout on its inputs:
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=.5),
        num_units=1,
        nonlinearity=lasagne.nonlinearities.sigmoid)

    return network
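To sanity-check the architecture, the output shape can be inspected without compiling anything via lasagne.layers.get_output_shape; the (numSamples, 1, 48, 64) input shape below is my assumption for single-channel 48x64 images:

# Assuming trainset.shape == (numSamples, 1, 48, 64)
net = build_cnn(trainset)
print(lasagne.layers.get_output_shape(net))   # -> (None, 1): one sigmoid unit per sample

So predictions come out as one column per batch, e.g. (52, 1) for a 52-sample validation batch.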
The target matrices for all sets are created like this (using the training target vector as an example):
targetsTrain = np.vstack((targetsTrain, [[targetClass]] * numTr))
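For reference, a self-contained sketch of what this vstack pattern produces; the empty (0, 1) initialisation is my assumption, since that part of the code is not shown:

import numpy as np

targetsTrain = np.empty((0, 1))                        # assumed initialisation
targetsTrain = np.vstack((targetsTrain, [[0]] * 3))    # 3 samples of class 0
targetsTrain = np.vstack((targetsTrain, [[1]] * 2))    # 2 samples of class 1
print(targetsTrain.shape)   # -> (5, 1): a column matrix, matching T.imatrix('targets')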
... and the Theano variables themselves:
import numpy as np
import theano
import theano.tensor as T
from theano import function
import lasagne

inputVar = T.tensor4('inputs')
targetVar = T.imatrix('targets')

network = build_cnn(trainset, inputVar)

# Training loss: binary cross-entropy on the sigmoid output
predictions = lasagne.layers.get_output(network)
loss = lasagne.objectives.binary_crossentropy(predictions, targetVar)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=0.01, momentum=0.9)

# Validation expressions (deterministic=True disables the dropout)
valPrediction = lasagne.layers.get_output(network, deterministic=True)
valLoss = lasagne.objectives.binary_crossentropy(valPrediction, targetVar)
valLoss = valLoss.mean()
# This is the expression the error below points at:
valAcc = T.mean(T.eq(T.argmax(valPrediction, axis=1), targetVar), dtype=theano.config.floatX)

train_fn = function([inputVar, targetVar], loss, updates=updates, allow_input_downcast=True)
val_fn = function([inputVar, targetVar], [valLoss, valAcc])
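Side question: for a single sigmoid unit, should the accuracy not be computed by thresholding the prediction instead of taking an argmax? Something like the sketch below (my own attempt, not from the tutorial), which keeps both sides of the comparison at shape (N, 1):

# Round the sigmoid output to 0/1 and compare elementwise with the (N, 1) targets
valAcc = T.mean(T.eq(T.round(valPrediction), targetVar), dtype=theano.config.floatX)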
Finally, here are the two loops, training and validation. The first one works fine; the second throws the error, an excerpt of which is given further below.
# -- Neural network training itself -- #
numIts = 100
for itNr in range(numIts):
    train_err = 0
    train_batches = 0
    for batch in iterate_minibatches(trainset.astype('float32'),
                                     targetsTrain.astype('int8'),
                                     len(trainset) // 4, shuffle=True):
        inputs, targets = batch
        print(inputs.shape)
        print(targets.shape)
        train_err += train_fn(inputs, targets)
        train_batches += 1

    # And a full pass over the validation data:
    val_err = 0
    val_acc = 0
    val_batches = 0
    for batch in iterate_minibatches(valset.astype('float32'),
                                     targetsVal.astype('int8'),
                                     len(valset) // 3, shuffle=False):
        inputs, targets = batch
        err, acc = val_fn(inputs, targets)   # <- this call raises the ValueError
        val_err += err
        val_acc += acc
        val_batches += 1
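For completeness, iterate_minibatches is the batching helper from the lasagne tutorial; I copied it unchanged, so it should look like this:

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    # Yield (inputs, targets) slices of size batchsize, optionally shuffled
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]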
The error (excerpt):
Exception "unhandled ValueError"
Input dimension mis-match. (input[0].shape[1] = 52, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{eq,no_inplace}(DimShuffle{x,0}.0, targets)
Toposort index: 36
Inputs types: [TensorType(int64, row), TensorType(int32, matrix)]
Inputs shapes: [(1, 52), (52, 1)]
Inputs strides: [(416, 8), (4, 4)]
Inputs values: ['not shown', 'not shown']
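As far as I understand the trace, the two inputs of the failing Elemwise{eq} are the argmax result, dimshuffled into a (1, 52) row, and my (52, 1) target matrix. A stripped-down sketch (variable names are mine) that reproduces the same ValueError:

import numpy as np
import theano
import theano.tensor as T

pred = T.matrix('pred')         # network output, (N, 1) at runtime
targets = T.imatrix('targets')  # target matrix, (N, 1) at runtime

# T.eq pads the (N,) argmax vector on the left, giving a (1, N) row
acc = T.eq(T.argmax(pred, axis=1), targets)
f = theano.function([pred, targets], acc)

f(np.zeros((52, 1), dtype=theano.config.floatX),
  np.zeros((52, 1), dtype='int32'))
# -> ValueError: Input dimension mis-match.
#    (input[0].shape[1] = 52, input[1].shape[1] = 1)

Theano broadcasts the (1, 52) row along its first dimension, but refuses to broadcast the second dimension of the target matrix, since a matrix type is not marked broadcastable there, hence the mismatch.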
Thanks again for your help!