
When I tried to train a simple MLP I got strange results, and even after stripping the code down to the bare essentials and shrinking the problem, I still get strange results.

Code

import numpy as np
import theano
import theano.tensor as T
import lasagne


dtype = np.float32
# Three one-hot input states, shaped (batch, channels, rows, cols) for the input layer
states = np.eye(3, dtype=dtype).reshape(3, 1, 1, 3)
# Target values the network should reproduce for each state
values = np.array([[147, 148, 135, 147],
                   [147, 147, 149, 148],
                   [148, 147, 147, 147]], dtype=dtype)
output_dim = values.shape[1]
hidden_units = 50

# Network setup
inputs = T.tensor4('inputs')
targets = T.matrix('targets')

network = lasagne.layers.InputLayer(shape=(None, 1, 1, 3), input_var=inputs)
network = lasagne.layers.DenseLayer(network, hidden_units,
                                    nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(network, output_dim)

prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.squared_error(prediction, targets).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.sgd(loss, params, learning_rate=0.01)

f_learn = theano.function([inputs, targets], loss, updates=updates)
f_test = theano.function([inputs], prediction)


# Training
it = 5000
for i in range(it):
    l = f_learn(states, values)
    print("Loss: " + str(l))
    print("Expected:")
    print(values)
    print("Learned:")
    print(f_test(states))
    print("Last layer weights:")
    print(lasagne.layers.get_all_param_values(network)[-1])

I expect the network to learn the values given in the 'values' variable, and often it does, but just as often it leaves some output nodes stuck at zero, with a huge loss.

Sample output

Loss: 5426.83349609
Expected:
[[ 147.  148.  135.  147.]
 [ 147.  147.  149.  148.]
 [ 148.  147.  147.  147.]]
Learned:
[[ 146.99993896    0.          134.99993896  146.99993896]
 [ 146.99993896    0.          148.99993896  147.99993896]
 [ 147.99995422    0.          146.99996948  146.99993896]]
Last layer weights:
[ 11.40957355   0.          11.36747837  10.98625183]

Why does this happen?


1 Answer


I asked the same question in the lasagne Google group and had more luck there: https://groups.google.com/forum/#!topic/lasagne-users/ock-2RqTaFk. Changing the rectifier units to a nonlinearity that allows negative outputs helped.
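
For concreteness, here is a minimal sketch of that change (my reading of the fix, not the exact code from the thread). Lasagne's DenseLayer defaults to the rectify nonlinearity, so the output layer above was itself a ReLU: once a ReLU unit's pre-activation goes negative for every training input, its gradient is zero and SGD can never revive it, which is why a whole output column (and the matching last-layer weights) sits at exactly 0. Making the output layer linear, and optionally using leaky_rectify in the hidden layer, keeps a nonzero gradient everywhere:

# Sketch of the fix: same architecture as above, but with nonlinearities
# that do not clamp negative values. Reuses inputs, hidden_units, output_dim.
network = lasagne.layers.InputLayer(shape=(None, 1, 1, 3), input_var=inputs)
# leaky_rectify keeps a small slope (0.01) for negative pre-activations,
# so a hidden unit that gets pushed negative can still recover.
network = lasagne.layers.DenseLayer(network, hidden_units,
                                    nonlinearity=lasagne.nonlinearities.leaky_rectify)
# A linear (identity) output layer instead of the default rectify, so
# the network can produce any real-valued regression output.
network = lasagne.layers.DenseLayer(network, output_dim,
                                    nonlinearity=lasagne.nonlinearities.linear)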

answered 2015-11-04T17:08:27.010