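I remember that one of the strong points of lightfm is that the model does not suffer from the cold-start problem, for both users and items (see the original lightfm paper).
However, I still don't understand how to use lightfm to address the cold-start problem. My model is trained on user-item interaction data.
As I understand it, I can only make predictions for profile_ids that already exist in my dataset.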
    def predict(self, user_ids, item_ids, item_features=None,
                user_features=None, num_threads=1):
        """
        Compute the recommendation score for user-item pairs.

        Arguments
        ---------

        user_ids: integer or np.int32 array of shape [n_pairs,]
             single user id or an array containing the user ids for the
             user-item pairs for which a prediction is to be computed
        item_ids: np.int32 array of shape [n_pairs,]
             an array containing the item ids for the user-item pairs for which
             a prediction is to be computed.
        user_features: np.float32 csr_matrix of shape [n_users, n_user_features], optional
             Each row contains that user's weights over features.
        item_features: np.float32 csr_matrix of shape [n_items, n_item_features], optional
             Each row contains that item's weights over features.
        num_threads: int, optional
             Number of parallel computation threads to use. Should
             not be higher than the number of physical cores.

        Returns
        -------

        np.float32 array of shape [n_pairs,]
            Numpy array containing the recommendation scores for pairs defined
            by the inputs.
        """

        self._check_initialized()

        if not isinstance(user_ids, np.ndarray):
            user_ids = np.repeat(np.int32(user_ids), len(item_ids))

        assert len(user_ids) == len(item_ids)

        if user_ids.dtype != np.int32:
            user_ids = user_ids.astype(np.int32)
        if item_ids.dtype != np.int32:
            item_ids = item_ids.astype(np.int32)

        n_users = user_ids.max() + 1
        n_items = item_ids.max() + 1

        (user_features,
         item_features) = self._construct_feature_matrices(n_users,
                                                           n_items,
                                                           user_features,
                                                           item_features)

        lightfm_data = self._get_lightfm_data()

        predictions = np.empty(len(user_ids), dtype=np.float64)

        predict_lightfm(CSRMatrix(item_features),
                        CSRMatrix(user_features),
                        user_ids,
                        item_ids,
                        predictions,
                        lightfm_data,
                        num_threads)

        return predictions
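
To make my confusion concrete, here is a small NumPy-only sketch of how I currently read the method above together with the paper. It is not lightfm's actual implementation. It assumes that when no `user_features` are passed, each user effectively gets a single indicator feature of its own (an identity matrix), and that a user's latent vector is the sum of its features' embeddings. Biases are left out, and the sizes, the random embeddings, and the `score` helper are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 4, 3, 2  # toy sizes, purely illustrative

# Stand-ins for learned parameters (random here, not trained).
item_embeddings = rng.normal(size=(n_items, dim)).astype(np.float32)


def score(user_feature_row, feature_embeddings, item_embedding):
    """Dot product of a user's summed feature embeddings with an item vector."""
    user_vector = user_feature_row @ feature_embeddings
    return float(user_vector @ item_embedding)


# Case 1: no user features. Assumed equivalent to an identity matrix,
# so there is exactly one embedding per user id seen in training.
identity_features = np.eye(n_users, dtype=np.float32)
per_user_embeddings = rng.normal(size=(n_users, dim)).astype(np.float32)
known_score = score(identity_features[2], per_user_embeddings, item_embeddings[1])

# A brand-new user id has no row in that matrix, hence nothing to score with.
new_user_id = n_users
has_row = new_user_id < identity_features.shape[0]

# Case 2: users described by shared metadata features (3 hypothetical ones).
# A new user can be scored from features alone, without any interactions.
meta_embeddings = rng.normal(size=(3, dim)).astype(np.float32)
new_user_meta = np.array([1.0, 0.0, 1.0], dtype=np.float32)
cold_score = score(new_user_meta, meta_embeddings, item_embeddings[1])

print("known user:", known_score)
print("new user has an indicator row:", has_row)
print("new user via metadata:", cold_score)
```

If this reading is right, my interaction-only setup corresponds to the first case, and what I am missing is how to get to something like the second case with the real API.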
Any suggestions or pointers that would help me understand this would be greatly appreciated. Thank you.