python - lightfm 错误：并非所有估计的参数都是有限的，您的模型可能已经发散

Question

我正在运行这个非常简单的代码：

def csr_values_analysis(values):
   num_zeros = 0
   num_ones = 0
   num_other = 0

   for v in values:
       if v == 0:
           num_zeros += 1
       elif v == 1:
           num_ones += 1
       else:
           num_other += 1

   return num_zeros, num_ones, num_other


print("Reading user_features.npz")
with open("/path/to/user_features.npz", "rb") as in_file:
    user_features_csr = sp.load_npz(in_file)
    print("User features read, shape: {}".format(user_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(user_features_csr.data))

print("Reading item_features.npz")
with open("/path/to/item_features.npz", "rb") as in_file:
    item_features_csr = sp.load_npz(in_file)
    print("Item features read, shape: {}".format(item_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(item_features_csr.data))

print("Reading interactions.npz")
with open("/path/to/interactions.npz", "rb") as in_file:
    interactions_csr = sp.load_npz(in_file)
    print("Interactions read, shape: {}".format(interactions_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(interactions_csr.data))
    interactions_coo = interactions_csr.tocoo()

# Run lightfm

print("Running lightfm...")
model = LightFM(loss='warp')
model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)

使用以下输出：

Reading user_features.npz
User features read, shape: (827568, 105)
Data values analysis: zeros: 0, ones: 3153032, other: 0
Reading item_features.npz
Item features read, shape: (67339359, 36)
Data values analysis: zeros: 0, ones: 25259081, other: 0
Reading interactions.npz
Interactions read, shape: (827568, 67339359)
Data values analysis: zeros: 0, ones: 172388, other: 0
Running lightfm...
Epoch 0
Traceback (most recent call last):
  File "training.py", line 92, in <module>
    model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

我所有的 Scipy 稀疏矩阵都被归一化（即值是0或1）。

我试图改变学习计划和学习率，但没有结果。

我已经检查过，仅当我将项目特征添加到方程式时才会发生这种情况。仅使用交互或交互 + 用户功能运行 lightfm 时没有错误。

AFAIK，我已经安装了最新版本：

$ pip freeze | grep lightfm
lightfm==1.15

任何的想法？谢谢！

更新 1

我想知道我的稀疏矩阵是否太稀疏了......不过，我尝试过非常少的形状，并且出现了同样的错误：

>>> import scipy.sparse as sp
>>> import numpy as np
>>> import lightfm
>>> uf_row = np.array([2,4,9])
>>> uf_col = np.array([4,9,3])
>>> uf_data = np.array([1,1,1])
>>> if_row = np.array([0,3])
>>> if_col = np.array([9,7])
>>> if_data = np.array([1,1])
>>> i_row = np.array([1])
>>> i_col = np.array([8])
>>> i_data = np.array([1])
>>> uf_csr = sp.csr_matrix((uf_data, (uf_row, uf_col)), shape=(10, 10))
>>> if_csr = sp.csr_matrix((if_data, (if_row, if_col)), shape=(10, 10))
>>> i_csr = sp.csr_matrix((i_data, (i_row, i_col)), shape=(10, 10))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

确实，我做错了什么...

更新 2

我想我发现了问题......我做了以下实验：

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> i_csr = sp.csr_matrix((np.array([1]),(np.array([1]), np.array([1]))),shape=(20,20))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
Traceback (most recent call last):
  ...
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

即我像往常一样有例外。现在，如果你观察交互矩阵，它有一个关于用户和项目的交互，在用户和项目特征矩阵中，它们的所有特征分别设置为 0。所以，让我们在用户特征矩阵中改变它，例如：

>>> uf_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

瞧！

我们可以对项目特征矩阵做同样的事情：

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

因此，我将尝试找到一种过滤与全零用户和项目功能相关的交互的方法，然后我会发布它；）

score 0 · Accepted Answer

正如我在上次查询更新中所解释的那样，问题在于用户和项目的所有功能都设置为零，并且同时发生与其中一个用户和其中一个项目相关的交互。

话虽如此，我的第一个想法是删除与这些用户和项目相关的交互，但这可能会影响对这些用户的推荐，或者对这些项目的推荐。

因此，不同的解决方案可能是使用对角矩阵扩展用户和项目特征矩阵，以便至少将这样的特征（用户他/她自己）设置为 1。

0 0 0        0 0 0 1 0 0
0 1 0   -->  0 1 0 0 1 0
1 0 0        1 0 0 0 0 1

python - lightfm 错误：并非所有估计的参数都是有限的，您的模型可能已经发散

1 回答 1

Related

Reference