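Hi, I am trying to modify the mnist example to fit my own dataset. I am only using the mlp example, and it gives a strange error.

The dataset is a matrix with 2100 rows and 17 columns, and the output should be one of 16 possible classes. The error seems to occur in the second phase of training. The model is built correctly (confirmed by the log output).

Here is the error log: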

ValueError: y_i value out of bounds

Apply node that caused the error:

CrossentropySoftmaxArgmax1HotWithBias(Dot22.0, b, targets)

Toposort index: 33

Inputs types: [TensorType(float64, matrix), TensorType(float64, vector), TensorType(int32, vector)]

Inputs shapes: [(100, 17), (17,), (100,)]

Inputs strides: [(136, 8), (8,), (4,)]

Inputs values: ['not shown', 'not shown', 'not shown']

Outputs clients: [[Sum{acc_dtype=float64}(CrossentropySoftmaxArgmax1HotWithBias.0)], [CrossentropySoftmax1HotWithBiasDx(Assert{msg='sm and dy do not have the same shape.'}.0, CrossentropySoftmaxArgmax1HotWithBias.1, targets)], []]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
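
The flags named in the hints can be passed through the THEANO_FLAGS environment variable. A minimal sketch, assuming the flags are set from inside the script before Theano is imported for the first time (exporting the variable in the shell works as well); the `import theano` line is left commented out so the snippet runs without Theano installed:

```python
import os

# Theano reads THEANO_FLAGS when it is first imported, so set it beforehand.
os.environ["THEANO_FLAGS"] = "optimizer=fast_compile,exception_verbosity=high"

# import theano  # import only after the flags are in place
```

With these flags the traceback should point at the line that created the failing node and print more detail about it.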

Here is the code:

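# Imports the code below relies on:
import time

import lasagne
import theano
import theano.tensor as T

def build_mlp(input_var=None):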
    l_in = lasagne.layers.InputLayer(shape=(None, 16),
                                 input_var=input_var)

    # Apply 20% dropout to the input data:
    l_in_drop = lasagne.layers.DropoutLayer(l_in, p=0.2)

    # Add a fully-connected layer of 10 units, using the linear rectifier, and
    # initializing weights with Glorot's scheme (which is the default anyway):
    l_hid1 = lasagne.layers.DenseLayer(
        l_in_drop, num_units=10,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())

    # We'll now add dropout of 50%:
    l_hid1_drop = lasagne.layers.DropoutLayer(l_hid1, p=0.5)

    # Another 10-unit layer:
    l_hid2 = lasagne.layers.DenseLayer(
        l_hid1_drop, num_units=10,
        nonlinearity=lasagne.nonlinearities.rectify)

    # 50% dropout again:
    l_hid2_drop = lasagne.layers.DropoutLayer(l_hid2, p=0.5)

    # Finally, we'll add the fully-connected output layer, of 17 softmax units:
    l_out = lasagne.layers.DenseLayer(
        l_hid2_drop, num_units=17,
        nonlinearity=lasagne.nonlinearities.softmax)

    # Each layer is linked to its incoming layer(s), so we only need to pass
    # the output layer to give access to a network in Lasagne:
    return l_out

def main(model='mlp', num_epochs=300):
    # Load the dataset
    print("Loading data...")
    X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()

    # Prepare Theano variables for inputs and targets
    input_var = T.matrix('inputs')
    target_var = T.ivector('targets')

    # Create neural network model (depending on first command line parameter)
    print("Building model and compiling functions...")
    if model == 'cnn':
        network = build_cnn(input_var)
    elif model == 'mlp':
        network = build_mlp(input_var)
    elif model == 'lstm':
        network = build_lstm(input_var)
    else:
        print("Unrecognized model type %r." % model)
        return  # without a network there is nothing to train

    # Create a loss expression for training, i.e., a scalar objective we want
    # to minimize (for our multi-class problem, it is the cross-entropy loss):
    prediction = lasagne.layers.get_output(network)
    loss = lasagne.objectives.categorical_crossentropy(prediction, target_var)
    loss = loss.mean()
    # We could add some weight decay as well here, see lasagne.regularization.

    # Create update expressions for training, i.e., how to modify the
    # parameters at each training step. Here, we'll use Stochastic Gradient
    # Descent (SGD) with Nesterov momentum, but Lasagne offers plenty more.
    params = lasagne.layers.get_all_params(network, trainable=True)
    updates = lasagne.updates.nesterov_momentum(
        loss, params, learning_rate=0.01, momentum=0.9)

    # Create a loss expression for validation/testing. The crucial difference
    # here is that we do a deterministic forward pass through the network,
    # disabling dropout layers.
    test_prediction = lasagne.layers.get_output(network, deterministic=True)
    test_loss = lasagne.objectives.categorical_crossentropy(test_prediction,
                                                        target_var)
    test_loss = test_loss.mean()
    # As a bonus, also create an expression for the classification accuracy:
    test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
                  dtype=theano.config.floatX)

    # Compile a function performing a training step on a mini-batch (by giving
    # the updates dictionary) and returning the corresponding training loss:
    train_fn = theano.function([input_var, target_var], loss, updates=updates)

    # Compile a second function computing the validation loss and accuracy:
    val_fn = theano.function([input_var, target_var], [test_loss, test_acc])

    # Finally, launch the training loop.
    print("Starting training...")
    # We iterate over epochs:
    for epoch in range(num_epochs):
        # In each epoch, we do a full pass over the training data:
        train_err = 0
        train_batches = 0
        start_time = time.time()
        for batch in iterate_minibatches(X_train, y_train, 100, shuffle=True):
            inputs, targets = batch
            train_err += train_fn(inputs, targets)
            train_batches += 1

        # And a full pass over the validation data:
        val_err = 0
        val_acc = 0
        val_batches = 0
        for batch in iterate_minibatches(X_val, y_val, 100, shuffle=False):
            inputs, targets = batch
            err, acc = val_fn(inputs, targets)
            val_err += err
            val_acc += acc
            val_batches += 1

        # Then we print the results for this epoch:
        print("Epoch {} of {} took {:.3f}s".format(
            epoch + 1, num_epochs, time.time() - start_time))
        print("  training loss:\t\t{:.6f}".format(train_err / train_batches))
        print("  validation loss:\t\t{:.6f}".format(val_err / val_batches))
        print("  validation accuracy:\t\t{:.2f} %".format(
            val_acc / val_batches * 100))

    # After training, we compute and print the test error:
    test_err = 0
    test_acc = 0
    test_batches = 0
    for batch in iterate_minibatches(X_test, y_test, 100, shuffle=False):
        inputs, targets = batch
        err, acc = val_fn(inputs, targets)
        test_err += err
        test_acc += acc
        test_batches += 1
    print("Final results:")
    print("  test loss:\t\t\t{:.6f}".format(test_err / test_batches))
    print("  test accuracy:\t\t{:.2f} %".format(
        test_acc / test_batches * 100))

1 Answer

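I found the problem: my dataset does not contain an example for every target output, because it is too small! There are 17 target outputs, but my dataset only has 16 distinct outputs, and there are no examples of the 17th output.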
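
The "y_i value out of bounds" error is raised when a target index lies outside the range the output layer can represent, that is, outside 0 to num_units - 1. A minimal NumPy sketch for inspecting the labels before training; the helper names `out_of_range_labels` and `remap_labels` and the sample label array are hypothetical, made up for illustration, and are not part of Lasagne or of the code in the question:

```python
import numpy as np

def out_of_range_labels(y, num_units):
    """Return the distinct labels in y that lie outside 0 .. num_units - 1."""
    y = np.asarray(y)
    bad = (y < 0) | (y >= num_units)
    return np.unique(y[bad])

def remap_labels(y):
    """Map arbitrary integer labels onto 0 .. K-1, K being the number of distinct labels."""
    classes, new_y = np.unique(np.asarray(y), return_inverse=True)
    return new_y.astype(np.int32), classes

# Hypothetical labels, for illustration only: numbered from 1, so 17 is out of range
# for an output layer with num_units=17 (valid indices are 0..16).
y_example = np.array([1, 2, 17, 5, 17, 3])

print(out_of_range_labels(y_example, num_units=17))
new_y, classes = remap_labels(y_example)
print(new_y, classes)
```

If the labels are remapped this way, the output layer's `num_units` could be set to `len(classes)` so that every index has a matching unit.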

To solve this, just replace softmax with rectify,

changing this:

l_out = lasagne.layers.DenseLayer(
    l_hid2_drop, num_units=17,
    nonlinearity=lasagne.nonlinearities.softmax)

to this:

l_out = lasagne.layers.DenseLayer(
    l_hid2_drop, num_units=17,
    nonlinearity=lasagne.nonlinearities.rectify)
answered 2015-10-28T15:00:49.660