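Before you read further, note that seeking and accepting direct help with your assignment on StackOverflow may violate your school's rules and have consequences for you as a student!
That said, here is how I would model this problem: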
import torch
U = 300 # number of users
M = 30 # number of movies
D = 4 # dimension of embedding vectors
source = torch.randint(0, 2, (U, M)) # users' ratings
X = source.transpose(0, 1) @ source # your `preprocessed_data`
# initial values for your embedding. This is what your algorithm needs to learn
v = torch.randn(M, D, requires_grad=True)
X = X.to(torch.float32) # necessary to be in line with `v`
# this is the `(v_i^T v_j - X_{i,j})**2` part
loss_elementwise = (v @ v.transpose(0, 1) - X).pow(2)
# now we need to get rid of the diagonal. Notice that we can equally
# well get rid of the diagonal and the whole upper triangular part,
# as well, since both V @ V.T and source.T @ source are symmetric, so
# the upper triangular part contains just
# a mirror reflection of the lower triangular part.
# This means that we actually implement a slightly different summation:
# sum(i=1,M) sum(j=1,i-1) stuff(i, j)
# instead of
# sum(i=1,M) sum(j=1,M) indicator[i != j] stuff(i, j)
# and get exactly half the original value
masked = torch.tril(loss_elementwise, -1)
# finally we sum it up, multiplying by 2 to make up
# for the "lost" upper triangular part
loss = 2 * masked.sum()
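As a sanity check on the masking argument in the comments above, the triangular shortcut can be compared with the literal double sum over `i != j`. This is only a sketch: the seed, the helper name `off_diagonal` and the tolerance passed to `torch.allclose` are my own assumptions, not part of the assignment.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the check is repeatable

U, M, D = 300, 30, 4
source = torch.randint(0, 2, (U, M))
X = (source.transpose(0, 1) @ source).to(torch.float32)
v = torch.randn(M, D)

loss_elementwise = (v @ v.transpose(0, 1) - X).pow(2)

# Literal form: sum over all (i, j) with i != j, using an explicit mask
# that is True everywhere except on the diagonal.
off_diagonal = ~torch.eye(M, dtype=torch.bool)  # hypothetical helper name
loss_full = loss_elementwise[off_diagonal].sum()

# Shortcut form: strictly lower triangle, doubled.
loss_tril = 2 * torch.tril(loss_elementwise, -1).sum()

# The two agree up to floating-point rounding (tolerance is an assumption).
print(torch.allclose(loss_full, loss_tril, rtol=1e-4))
```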
What remains to be implemented is the optimization loop, which will use the gradient of `loss` to optimize the values of `v`.
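For orientation, one possible shape of such a loop is sketched below. The choice of `torch.optim.Adam`, the learning rate, the number of steps and the helper name `compute_loss` are all assumptions made for illustration; your assignment may prescribe a different optimizer or a hand-written gradient step.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only to make the sketch repeatable

U, M, D = 300, 30, 4
source = torch.randint(0, 2, (U, M))
X = (source.transpose(0, 1) @ source).to(torch.float32)
v = torch.randn(M, D, requires_grad=True)


def compute_loss(v, X):
    """Sum of squared off-diagonal errors, same formula as above (hypothetical helper)."""
    loss_elementwise = (v @ v.transpose(0, 1) - X).pow(2)
    return 2 * torch.tril(loss_elementwise, -1).sum()


optimizer = torch.optim.Adam([v], lr=0.1)  # assumed optimizer and learning rate
num_steps = 500                            # assumed step count

initial_loss = compute_loss(v, X).item()
for step in range(num_steps):
    optimizer.zero_grad()      # clear gradients left over from the previous step
    loss = compute_loss(v, X)  # forward pass
    loss.backward()            # populate v.grad
    optimizer.step()           # update v in place

final_loss = compute_loss(v, X).item()
print(f"loss before: {initial_loss:.1f}, after: {final_loss:.1f}")
```

Because the entries of `X` are large compared with the initial `v @ v.T`, the raw gradients are large too; an adaptive optimizer such as Adam is less sensitive to that than plain SGD with a hand-picked learning rate, which is the only reason it is used here.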