
Here is an example in PyTorch:

optimizer = optim.Adam([modifier_var], lr=0.0005)

And here in TensorFlow:

self.train = self.optimizer.minimize(self.loss, var_list=[self.modifier])

But Chainer's optimizers can only be used on a Link. How can I apply an Optimizer to a Variable in Chainer?


2 Answers


In short, there is no way to directly assign a chainer.Variable (not even a chainer.Parameter) to a chainer.Optimizer.

The following is a somewhat lengthy explanation.

First, let me define Variable and Parameter to avoid confusion.

Variable is (1) torch.Tensor in PyTorch v0.4, (2) torch.autograd.Variable in PyTorch v0.3, and (3) chainer.Variable in Chainer v4.
A Variable is an object that holds two tensors: .data and .grad. That is the necessary and sufficient condition, so a Variable is not necessarily a learnable parameter, i.e., a target of the optimizer.
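
For example, in Chainer (a minimal sketch; the array values are arbitrary):

import numpy as np
import chainer
import chainer.functions as F

# A Variable holds a data tensor; .grad is filled in by backward().
v = chainer.Variable(np.array([1.0, 2.0], dtype=np.float32))
loss = F.sum(v * v)
loss.backward()
print(v.data)  # the wrapped tensor: [1. 2.]
print(v.grad)  # the gradient d(loss)/dv = 2 * v.data: [2. 4.]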

In both libraries, there is another class, Parameter, which is similar to but not the same as Variable: torch.nn.Parameter in PyTorch and chainer.Parameter in Chainer.
A Parameter must be a learnable parameter and should be optimized.
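
For instance, the weights of a Link are Parameter objects (a sketch; the layer sizes are arbitrary):

import chainer.links as L

layer = L.Linear(3, 2)
print(type(layer.W))  # chainer.Parameter, i.e. a learnable Variable
print(layer.W.shape)  # (2, 3): exactly the kind of tensor an optimizer should update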

Therefore, there should be no case where you register a Variable (as opposed to a Parameter) to an Optimizer (although PyTorch allows registering a Variable to an Optimizer, this is just for backward compatibility).

Second, in PyTorch torch.optim.Optimizer directly optimizes Parameters, but in Chainer chainer.Optimizer DOES NOT optimize Parameters itself: chainer.UpdateRule does. The Optimizer just registers UpdateRules to the Parameters in a Link, as the following sketch shows.
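
You can observe this after setup() (a sketch; it assumes a simple L.Linear link):

import chainer
import chainer.links as L

layer = L.Linear(3, 2)
optimizer = chainer.optimizers.Adam()
optimizer.setup(layer)

# setup() attached an UpdateRule to every Parameter of the Link;
# the Optimizer itself does not touch the Parameters directly.
print(layer.W.update_rule)  # an AdamRule instance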

Therefore, it is only natural that chainer.Optimizer does not receive a Parameter as an argument: it is just a "delivery-man" for UpdateRules.

If you want to attach a different UpdateRule to each Parameter, you should directly create an instance of an UpdateRule subclass and attach it to the Parameter, as in the sketch below.
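
For example, to optimize one specific parameter with plain SGD while the others keep Adam's rule (a sketch; the learning rate 0.01 is arbitrary):

import chainer
import chainer.links as L
from chainer.optimizers.sgd import SGDRule

layer = L.Linear(3, 2)
optimizer = chainer.optimizers.Adam()
optimizer.setup(layer)  # attaches an AdamRule to every Parameter

# Override the rule for the bias only: create an UpdateRule subclass
# instance and attach it to the Parameter directly.
layer.b.update_rule = SGDRule(lr=0.01)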

answered 2018-10-17T06:02:29.253

Below is an example of learning a regression task with a MyChain MLP model using the Adam optimizer in Chainer.

import numpy as np

import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain

# Prepare your model (neural network) as a `Link` or `Chain`
class MyChain(Chain):
    def __init__(self):
        super(MyChain, self).__init__(
            l1=L.Linear(None, 30),
            l2=L.Linear(None, 30),
            l3=L.Linear(None, 1)
        )

    def __call__(self, x):
        h = self.l1(x)
        h = self.l2(F.sigmoid(h))
        return self.l3(F.sigmoid(h))

model = MyChain()

# Then you can instantiate the optimizer
optimizer = chainer.optimizers.Adam()

# Register the model to the optimizer (to indicate which parameters to update)
optimizer.setup(model)

# Calculate the loss as follows.
def lossfun(x, y):
    loss = F.mean_squared_error(model(x), y)
    return loss

# Toy regression data; any float32 arrays of matching shapes will do.
x = np.random.rand(100, 2).astype(np.float32)
y = np.sum(x, axis=1, keepdims=True)

# This iteration is the "training" that fits the model to the desired function.
# optimizer.update() clears the gradients, calls lossfun(x, y),
# backpropagates, and runs one optimization step.
for i in range(300):
    optimizer.update(lossfun, x, y)

So in summary, you need to set up the model with the optimizer; after that, you can use the update() function to calculate the loss and update the model's parameters. The above code comes from here.

Also, there is another way to write training code, using the Trainer module; a minimal sketch follows. For a more detailed tutorial of Chainer, please refer to the official documentation.
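
A Trainer-based version of the same training loop might look like this (a sketch reusing model, optimizer, lossfun, x, and y from above; the batch size, iteration count, and output directory are arbitrary):

from chainer import training
from chainer.datasets import TupleDataset
from chainer.iterators import SerialIterator

# Wrap the data in a dataset and an iterator that yields minibatches.
train_iter = SerialIterator(TupleDataset(x, y), batch_size=16)

# StandardUpdater feeds each minibatch to optimizer.update(lossfun, ...).
updater = training.StandardUpdater(train_iter, optimizer, loss_func=lossfun)

# Train for 300 iterations; extensions (e.g. LogReport) could be added here.
trainer = training.Trainer(updater, (300, 'iteration'), out='result')
trainer.run()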

answered 2018-10-16T12:32:38.217