
I am building a LightGCN model with DGL. I have looked at other similar questions but still cannot find a solution to my problem. I am not saving or reusing hidden states or any other intermediate values in this model.

This is the loss function I defined:

import torch
import torch.nn.functional as F

def bpr_loss(data, graph):
    # final node embeddings cached on the graph by the forward pass
    final_emb = graph.ndata["Embedding"]
    user_ids, pos_items, neg_items = data[:, 0], data[:, 1], data[:, 2]
    user_emb = final_emb[user_ids.long()]
    pos_emb = final_emb[pos_items.long()]
    neg_emb = final_emb[neg_items.long()]
    # dot-product scores for the positive and negative items
    pred_pos = torch.mul(user_emb, pos_emb).sum(dim=1)
    pred_neg = torch.mul(user_emb, neg_emb).sum(dim=1)

    loss = F.softplus(pred_neg - pred_pos).sum()

    return loss
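
For context, data is assumed to be an [N, 3] tensor of (user, positive item, negative item) node IDs. A minimal sketch of how I call the loss, on a toy DGL graph with made-up sizes (not my real data):

import dgl
import torch

# Toy graph with random "Embedding" node features, only to show the expected layout.
g = dgl.rand_graph(100, 400)
g.ndata["Embedding"] = torch.randn(100, 64)

# One batch = [N, 3] tensor of (user, pos_item, neg_item) node IDs.
batch = torch.tensor([[0, 10, 42],
                      [1, 11, 37],
                      [2, 12, 55]])

print(bpr_loss(batch, g))  # scalar tensor: summed BPR loss for the batch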

Here is the training code:

from torch.utils.data import DataLoader  # assuming the standard PyTorch DataLoader

for epoch in range(epochs):
    # train
    model.train()
    # forward: run once per epoch; the result is cached on the graph
    train_g.ndata["Embedding"] = model(train_g.ndata["id"])
    dataloader = DataLoader(bpr_data, batch_size=bpr_batch_size, shuffle=True)
    for data in dataloader:
        batch_loss = bpr_loss(data, train_g)

        optimizer.zero_grad()
        batch_loss.backward()
        optimizer.step()
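
If it helps, I think the essential pattern here reduces to the following plain-PyTorch sketch (made-up tensors), where a single forward result is reused across several backward() calls; I suspect my loop hits the same case:

import torch

w = torch.randn(5, requires_grad=True)  # stands in for the embedding table
y = torch.sigmoid(w)                    # one "forward" pass, cached like train_g.ndata["Embedding"]

y[:2].sum().backward()  # first mini-batch: fine
y[2:].sum().backward()  # second mini-batch: "Trying to backward through the graph a second time"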

Here is the model code:

import torch.nn as nn
from dgl.nn import GraphConv  # assuming DGL's built-in GraphConv layer


class LightGCN_full(nn.Module):

    def __init__(self, graph, emb_dim, num_layers):
        super(LightGCN_full, self).__init__()
        self.graph = graph
        self.emb = nn.Embedding(graph.num_nodes(), emb_dim)
        # LightGCN propagation: no learnable weight or bias in the conv layer
        self.light_conv = GraphConv(emb_dim, emb_dim, weight=False, bias=False, allow_zero_in_degree=True)
        self.num_layers = num_layers
        nn.init.normal_(self.emb.weight, std=0.1)

    def forward(self, id):
        out = self.emb(id)
        final_emb = out
        for i in range(self.num_layers):
            out = self.light_conv(self.graph, out)
            final_emb += out
        # average of the layer-0..num_layers embeddings
        final_emb /= self.num_layers + 1

        return final_emb
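
For completeness, forward returns the average of the layer-0..num_layers embeddings, and that output stays attached to the autograd graph of self.emb. A small sanity check on a toy graph (made-up sizes):

import dgl
import torch

toy_g = dgl.rand_graph(100, 400)
toy_model = LightGCN_full(toy_g, emb_dim=64, num_layers=3)

out = toy_model(torch.arange(toy_g.num_nodes()))
print(out.shape)          # torch.Size([100, 64])
print(out.requires_grad)  # True: this tensor carries the graph of the forward pass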

This is the error message:

Traceback (most recent call last):
  File "D:\lightgcn\src\LightGCN\main_bpr.py", line 183, in <module>
    batch_loss.backward()
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.