When I try to assign part of a pre-trained model's parameters to a module defined in a new PyTorch model, I get two different outputs depending on which of two methods I use.
The network is defined as follows:
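    import torch
    import torch.nn as nn


    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # Pre-trained ResNet-18 with its final fully connected layer removed
            self.resnet = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
            self.resnet = nn.Sequential(*list(self.resnet.children())[:-1])
            self.freeze_model(self.resnet)
            # Classification head on the 512-dimensional ResNet features
            self.classifier = nn.Sequential(
                nn.Dropout(),
                nn.Linear(512, 256),
                nn.ReLU(),
                nn.Linear(256, 3),
            )

        def freeze_model(self, model):
            # Body not shown in the original post; assumed to disable gradients
            for param in model.parameters():
                param.requires_grad = False

        def forward(self, x):
            out = self.resnet(x)
            out = out.flatten(start_dim=1)
            out = self.classifier(out)
            return out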
What I want is to assign the pre-trained parameters to the classifier in net, an instance of Net. I tried two different ways:
    net = Net()

    # First way: load the whole checkpoint into the model
    net.load_state_dict(torch.load('model_CNN_pretrained.ptl'))

    # Second way: copy only the classifier's weights and biases
    params = torch.load('model_CNN_pretrained.ptl')
    net.classifier[1].weight = nn.Parameter(params['classifier.1.weight'], requires_grad=False)
    net.classifier[1].bias = nn.Parameter(params['classifier.1.bias'], requires_grad=False)
    net.classifier[3].weight = nn.Parameter(params['classifier.3.weight'], requires_grad=False)
    net.classifier[3].bias = nn.Parameter(params['classifier.3.bias'], requires_grad=False)
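One check that might help narrow down the difference is to list which checkpoint entries each way actually sets. The sketch below does this on a small stand-in model, not the real Net: `TinyNet`, `backbone`, and `differing_keys` are hypothetical names introduced for illustration, and the in-memory `checkpoint` stands in for the contents of model_CNN_pretrained.ptl. It assumes the checkpoint may hold entries outside the classifier, for example BatchNorm running statistics.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for Net: a backbone with BatchNorm plus a similar classifier."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))
        self.classifier = nn.Sequential(
            nn.Dropout(), nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3)
        )

    def forward(self, x):
        return self.classifier(self.backbone(x))


def differing_keys(state_a, state_b):
    """Return the state_dict keys whose tensors are not exactly equal."""
    return sorted(k for k in state_a if not torch.equal(state_a[k], state_b[k]))


torch.manual_seed(0)

# Stand-in for the saved checkpoint: a separately initialised model.
source = TinyNet()
source.train()
source(torch.randn(16, 4))  # a train-mode forward pass updates the BatchNorm running statistics
checkpoint = source.state_dict()

# First way: load every entry of the checkpoint.
net_a = TinyNet()
net_a.load_state_dict(checkpoint)

# Second way: copy only the classifier's Linear layers.
net_b = TinyNet()
for idx in (1, 3):
    net_b.classifier[idx].weight = nn.Parameter(
        checkpoint[f'classifier.{idx}.weight'], requires_grad=False)
    net_b.classifier[idx].bias = nn.Parameter(
        checkpoint[f'classifier.{idx}.bias'], requires_grad=False)

mismatch_a = differing_keys(checkpoint, net_a.state_dict())
mismatch_b = differing_keys(checkpoint, net_b.state_dict())
print('first way, entries that differ from the checkpoint:', mismatch_a)
print('second way, entries that differ from the checkpoint:', mismatch_b)
```

In this stand-in, the first way leaves no mismatching entries, while the second way leaves every non-classifier entry that differs from the checkpoint untouched. Running the same comparison between torch.load('model_CNN_pretrained.ptl') and net.state_dict() would show whether that is also the case for the real model.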
In both cases the classifier parameters were assigned correctly, but I get two different outputs from the same input data. The first method works correctly, but the second does not. Could someone point out the difference between these two methods?
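When comparing outputs for the same input, it may also be worth ruling out randomness from nn.Dropout, which is active while a module is in training mode. The following is a minimal sketch on a hypothetical stand-in for the classifier head (smaller layer sizes, random input), showing the comparison done in eval mode under torch.no_grad().

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the classifier head, with smaller layer sizes.
head = nn.Sequential(nn.Dropout(), nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3))
x = torch.randn(5, 8)

# A freshly constructed module starts in training mode, so Dropout is active.
started_in_training = head.training

# In eval mode Dropout is a no-op, so repeated calls give the same result.
head.eval()
with torch.no_grad():
    eval_first = head(x)
    eval_second = head(x)

same_in_eval = torch.equal(eval_first, eval_second)
print(started_in_training, same_in_eval)
```

For the real model, this would mean calling net.eval() before feeding the same input to both versions, so that any remaining difference comes from the parameters and buffers rather than from Dropout.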