I load a pretrained model in Chainer:

net = chainer.links.VGG16Layers(pretrained_model='auto')

Then I run a forward pass on some data and add a loss layer:

acts = net.predict([image]).array
loss = chainer.Variable(np.array(np.sum(np.square(acts - one_hot))))

Now the question is: how do I run the backward pass and get the gradients of the different layers?

The usual backward method does not work.

2 Answers

Point 1.
Do not call VGGLayers.predict(); it is not meant for backpropagation, because it runs the network with backprop disabled.
Use VGGLayers.extract() instead.

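Point 2.
Do not apply np.square() and np.sum() directly to a chainer.Variable; NumPy calls are not recorded in the computational graph.
Use F.square() and F.sum() on the chainer.Variable instead.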

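Point 3.
Use loss.backward() to get .grad for the learnable parameters. (pattern 1)
Use loss.backward(retain_grad=True) to get .grad for all variables, including intermediate ones. (pattern 2)
Use chainer.grad() to get the gradients with respect to specific variables; it returns them instead of storing them in .grad. (pattern 3)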

Code:

import chainer
from chainer import functions as F, links as L
from cv2 import imread

net = L.VGG16Layers(pretrained_model='auto')
img = imread("/path/to/img")[:, :, ::-1]  # cv2 loads BGR; Chainer's VGG preprocessing expects RGB
prob = net.extract([img], layers=['prob'])['prob']  # use extract(), not predict(): predict() runs with enable_backprop disabled
intermediate = F.square(prob)
loss = F.sum(intermediate)

# pattern 1:
loss.backward()
print(net.fc8.W.grad)  # some ndarray
print(intermediate.grad)  # None
###########################################
net.cleargrads()
intermediate.grad = None
prob.grad = None
###########################################

# pattern 2:
loss.backward(retain_grad=True)
print(net.fc8.W.grad)  # some ndarray
print(intermediate.grad)  # some ndarray

###########################################
net.cleargrads()
intermediate.grad = None
prob.grad = None
###########################################

# pattern 3:
print(chainer.grad([loss], [net.fc8.W]))  # a list containing one Variable
print(intermediate.grad)  # None
answered 2018-11-04T05:09:34.607

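If you want the gradient with respect to the input image, you have to wrap the input in a chainer.Variable.
However, VGGLayers.extract() does not accept a Variable as input, so in that case call .forward() or its wrapper __call__() instead.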

import chainer
from chainer import Variable
from chainer import functions as F
from cv2 import imread
from chainer.links.model.vision import vgg

net = vgg.VGG16Layers(pretrained_model='auto')

# convert raw image (np.ndarray, dtype=uint8) to a batch of Variable(dtype=float32)
img = imread("path/to/image")[:, :, ::-1]  # cv2 loads BGR; vgg.prepare expects RGB
img = Variable(vgg.prepare(img))
img = img.reshape((1,) + img.shape)  # (channel, height, width) -> (batch, channel, height, width)

# just call VGG16Layers.forward, which is wrapped by __call__()
prob = net(img)['prob']
intermediate = F.square(prob)
loss = F.sum(intermediate)

# calculate grad
img_grad, = chainer.grad([loss], [img])  # chainer.grad returns a list of Variables; unpack the single entry
print(img_grad.array)  # some ndarray
answered 2018-11-06T04:38:36.810