
I am trying to implement a structure like the block diagram below. I am able to implement it from scratch, but when I try to implement it in Keras I run into some difficulties. Any help would be appreciated. Specifically, I have two questions about implementing it in Keras.

1) How can I feed my actual outputs in as a separate input layer, as shown in the block diagram below? Since each input is fed into the network, I want the corresponding gold-standard output in the Y_true part shown in the diagram.
2) If I want to backpropagate the cost function from the cost section, is it possible to propagate backwards along the vertical path rather than along the path that contains a copy of the third layer?

[Block diagram of the overall setup in Keras]


2 Answers


I tried a custom loss function. It is possible, but a bit more complicated than usual (and I have no idea whether training will succeed...):

import keras.backend as K

def customLoss(yTrue, yPred):

    #starting with tensors shaped like (batch,5,3)

    #let's find the predicted class to compare - this example works with
    #categorical classification (only one true class per element in a sequence)
    trueMax = K.argmax(yTrue, axis=-1)
    predMax = K.argmax(yPred, axis=-1)
                #at this point, shapes become (batch,5)

    #let's find the differing results:
    neq = K.not_equal(trueMax, predMax)

    #now we sum the differing results - the cast is needed because not_equal
    #returns booleans. The elements with sum=0 are entirely correct
    neqsum = K.sum(K.cast(neq, 'float32'), axis=-1)
                #shape now is only (batch,)

    #to avoid false values being greater than 1, we do another comparison:
    trueFalse = K.equal(neqsum, 0)

    #we adjust from values between 0 and 1 to values between -1 and 1,
    #again casting the booleans to floats first:
    adj = 2 * K.cast(trueFalse, 'float32') - 1

    #now it's time to create Loss1 and Loss2 (which I don't know).
    #They are different from regular losses, because you must keep the batch
    #dimension so you can multiply the result with "adj".
    #As illustrative placeholders only, two per-sample losses could be:
    l1 = K.mean(K.mean(K.square(yTrue - yPred), axis=-1), axis=-1)  #e.g. per-sample MSE
    l2 = K.mean(K.mean(K.abs(yTrue - yPred), axis=-1), axis=-1)    #e.g. per-sample MAE
              #these two must also be shaped like (batch,)

    #then apply your formula:
    res = ((1 - adj) * l1) + ((adj - 1) * l2)
               #this step could perhaps be replaced by the K.switch function
               #(see the sketch after this function) - it would probably be
               #much more efficient, but I'd have to learn how to use it first

    #and finally, sum over the batch dimension, or use a mean value or anything similar
    return K.sum(res) #or K.mean(res)
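
As an aside, the K.switch alternative mentioned in the comments could look roughly like this (an untested sketch; it assumes K.switch selects between the two branches based on the boolean condition, which is equivalent to the arithmetic above):

# When the whole sequence is predicted correctly (trueFalse is True, adj = 1),
# the formula above yields 0; otherwise (adj = -1) it yields 2*(l1 - l2):
res = K.switch(trueFalse, K.zeros_like(l1), 2 * (l1 - l2))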

A test (the shapes are a bit different, but the number of dimensions is the same):

import numpy as np

def tprint(t):
    print(K.eval(K.shape(t)))  #K.eval works with both the Theano and TensorFlow backends
    print(K.eval(t))
    print("\n")

x = np.array([[[.2,.7,.1],[.6,.3,.1],[.3,.3,.4],[.6,.3,.1],[.3,.6,.1]],[[.5,.2,.3],[.3,.6,.1],[.2,.7,.1],[.7,.15,.15],[.5,.2,.3]]])
y = np.array([[[0.,1.,0.],[1.,0.,0.],[0.,0.,1.],[1.,0.,0.],[0.,1.,0.]],[[0.,1.,0.],[0.,0.,1.],[0.,1.,0.],[1.,0.,0.],[1.,0.,0.]]])


x = K.variable(x)
y = K.variable(y)

xM = K.argmax(x,axis=-1)
yM = K.argmax(y,axis=-1)

neq = K.not_equal(xM,yM)

neqsum = K.sum(K.cast(neq,'float32'),axis=-1,keepdims=False)
trueFalse = K.equal(neqsum,0)
adj = 2*K.cast(trueFalse,'float32') - 1

l1 = 3 * K.sum(K.sum(y,axis=-1),axis=-1)
l2 = 7 * K.sum(K.sum(y,axis=-1),axis=-1)

res = ((1-adj)*l1) +((adj-1)*l2)
sumres = K.sum(res) #or K.mean, or something similar
tprint(xM)
tprint(yM)
tprint(neq)
tprint(neqsum)
tprint(trueFalse)
tprint(adj)
tprint(l1)
tprint(l2)
tprint(res)
tprint(sumres)
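
To actually use this, you just pass customLoss to compile. A minimal sketch (the model architecture here is purely illustrative, not from the answer - any model whose output is shaped (batch,5,3) would do):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(3, activation='softmax', input_shape=(5, 8)))  # output: (batch,5,3)
model.compile(optimizer='adam', loss=customLoss)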
Answered 2017-06-30T13:25:24.113

Please try this. The main idea is to create a model with two outputs: one for y_pred and one for the loss. When compiling that model, use a list of loss functions, and we only care about the second loss.

from keras.models import Model
from keras.layers import Dense, Input
from keras.layers.merge import _Merge  # private base class for merge layers in Keras 2
from keras import backend as K
import numpy as np

class CustomMerge(_Merge):
    def _merge_function(self, inputs):
        output = inputs[0]
        for i in range(1, len(inputs)):
            output += inputs[i]
        return output

class CustomLoss(_Merge):

    def _merge_function(self, inputs):
        output = inputs[0]
        for i in range(1, len(inputs)):
            output -= inputs[i]
        return output


inp = Input(name='input', shape=[100])  # renamed to avoid shadowing the built-in "input"
y_true = Input(name='y_true', shape=[1])
layer1 = Dense(1024)(inp)
layer2 = Dense(128)(layer1)
layer3 = Dense(1)(layer2)

y_pred = CustomMerge()([layer3, y_true]) # do whatever you want to calculate y_pred
loss = CustomLoss()([layer3, y_pred]) # do whatever you want to calculate loss

model = Model(inputs=[inp, y_true], outputs=[y_pred, loss])
losses = [
            lambda y_true, y_pred: K.zeros_like(y_pred),  # don't care about this loss; zeros_like keeps the per-sample shape
            lambda y_true, y_pred: K.mean(K.square(y_pred), axis=-1),  # we only care about this loss, and it depends only on y_pred, no matter what y_true is
        ]
model.compile(loss=losses, optimizer='adam')
model.summary()

batch_size = 32
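
# NOTE: get_batch is not defined in the original answer; a hypothetical
# stand-in returning random data of the right shapes could be:
def get_batch(batch_size):
    X = np.random.rand(batch_size, 100).astype('float32')  # matches Input shape=[100]
    Y = np.random.rand(batch_size, 1).astype('float32')    # matches y_true shape=[1]
    return X, Y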

X, Y = get_batch(batch_size)
L = np.zeros((batch_size, 1))  # dummy targets for the loss output, matching its (batch,1) shape; the second loss ignores y_true anyway

model.train_on_batch([X, Y], [Y, L])
Answered 2017-06-30T08:56:48.430