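I'm trying to build a simple recommendation engine from two sets of boolean preference data. I want to use one dataset to compute the UserSimilarity and the UserNeighborhood, and then use those neighborhoods to make recommendations from the second set of boolean preference data.
I seem to have this working, but there is a problem when I compute the recommendations: if a user has neighbors based on the first dataset but does not exist in the second dataset (even though their neighbors do), no recommendations are produced for that user.
Here is the RecommenderBuilder code: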
    recommenderBuilder = new RecommenderBuilder() {
        public Recommender buildRecommender(DataModel recommendationModel) throws TasteException {
            UserSimilarity similarity = new LogLikelihoodSimilarity(trainingModel);
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, 0.7, similarity, recommendationModel);
            return new GenericBooleanPrefUserBasedRecommender(recommendationModel, neighborhood, similarity);
        }
    };
Here is a sample of the trainingModel file:
1,111
2,222
2,111
2,222
3,111
3,222
And the recommendationModel file:
1,91
1,92
2,91
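Running this recommends 92 for user 2, but throws a NoSuchUserException when it gets to user 3.
So... is there a way to generate recommendations from one dataset based on similarities computed on another dataset, without requiring every user to be present in the second dataset?
Here is the full code I'm using right now: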
    private DataModel trainingModel;
    private DataModel recommendationModel;
    private RecommenderEvaluator evaluator;
    private RecommenderIRStatsEvaluator evaluator2;
    private RecommenderBuilder recommenderBuilder;
    private DataModelBuilder modelBuilder;

    @Override
    public void afterPropertiesSet() throws IOException, TasteException {
        trainingModel = new GenericBooleanPrefDataModel(
            GenericBooleanPrefDataModel.toDataMap(new FileDataModel(new File("/music.csv")))
        );
        recommendationModel = new GenericBooleanPrefDataModel(
            GenericBooleanPrefDataModel.toDataMap(new FileDataModel(new File("/movies.csv")))
        );
        evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
        evaluator2 = new GenericRecommenderIRStatsEvaluator();
        recommenderBuilder = new RecommenderBuilder() {
            public Recommender buildRecommender(DataModel model) throws TasteException {
                UserSimilarity similarity = new LogLikelihoodSimilarity(trainingModel);
                UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, 0.7, similarity, model);
                return new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);
            }
        };
        modelBuilder = new DataModelBuilder() {
            public DataModel buildDataModel(FastByIDMap<PreferenceArray> trainingData) {
                return new GenericBooleanPrefDataModel(GenericBooleanPrefDataModel.toDataMap(trainingData));
            }
        };
    }
I then run this method:
    @Override
    public void testData() throws TasteException {
        double score = evaluator.evaluate(recommenderBuilder, modelBuilder, trainingModel, 0.9, 1.0);
        System.out.println("calculated score: " + score);
        try {
            IRStatistics stats = evaluator2.evaluate(
                recommenderBuilder, modelBuilder, trainingModel, null, 2,
                0.0,
                1.0
            );
            System.out.println("recall: " + stats.getRecall());
            System.out.println("precision: " + stats.getPrecision());
        } catch (Throwable t) {
            System.out.println("throwing " + t);
        }
        List<RecommendedItem> recommendations = recommenderBuilder.buildRecommender(recommendationModel).recommend(1, 2);
        System.out.println("user 1");
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
        recommendations = recommenderBuilder.buildRecommender(recommendationModel).recommend(2, 2);
        System.out.println("user 2");
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
        try {
            recommendations = recommenderBuilder.buildRecommender(recommendationModel).recommend(3, 2);
            System.out.println("user 3");
            for (RecommendedItem recommendation : recommendations) {
                System.out.println(recommendation);
            }
        } catch (Throwable t) {
            System.out.println("throwing " + t);
        }
    }
Which produces this output:
    calculated score: 0.7033357620239258
    recall: 1.0
    precision: 1.0
    user 1
    user 2
    RecommendedItem[item:9222, value:0.8516679]
    throwing org.apache.mahout.cf.taste.common.NoSuchUserException: 3
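To make the goal concrete, here is a minimal plain-Java sketch of the behaviour I'm after. It deliberately does not use Mahout: Jaccard overlap stands in for LogLikelihoodSimilarity, every user with non-zero similarity counts as a neighbor (instead of NearestNUserNeighborhood), and the names CrossModelSketch, jaccard and recommend are made up for illustration. The point is only that a user who is missing from the recommendation data is treated as having no preferences there, rather than causing an exception. The data is the sample from the two files above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class CrossModelSketch {

    // Jaccard overlap of two item sets (an assumed stand-in for LogLikelihoodSimilarity).
    static double jaccard(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Long> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // Similarity comes from `training`; candidate items come from `target`.
    static List<Long> recommend(long userId, int howMany,
                                Map<Long, Set<Long>> training,
                                Map<Long, Set<Long>> target) {
        Set<Long> own = training.getOrDefault(userId, Collections.emptySet());
        // A user absent from the target data simply has no preferences there.
        Set<Long> alreadySeen = target.getOrDefault(userId, Collections.emptySet());
        // TreeMap keeps item IDs ordered, so ties are broken by ascending ID.
        Map<Long, Double> scores = new TreeMap<>();
        for (Map.Entry<Long, Set<Long>> other : training.entrySet()) {
            if (other.getKey().longValue() == userId) {
                continue;
            }
            double sim = jaccard(own, other.getValue());
            if (sim <= 0.0) {
                continue;
            }
            // Neighbors missing from the target data contribute nothing.
            for (Long item : target.getOrDefault(other.getKey(), Collections.emptySet())) {
                if (!alreadySeen.contains(item)) {
                    scores.merge(item, sim, Double::sum);
                }
            }
        }
        List<Long> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> training = Map.of(
            1L, Set.of(111L),
            2L, Set.of(111L, 222L),
            3L, Set.of(111L, 222L));
        Map<Long, Set<Long>> target = Map.of(
            1L, Set.of(91L, 92L),
            2L, Set.of(91L));
        System.out.println("user 1 " + recommend(1L, 2, training, target)); // user 1 []
        System.out.println("user 2 " + recommend(2L, 2, training, target)); // user 2 [92]
        System.out.println("user 3 " + recommend(3L, 2, training, target)); // user 3 [91, 92]
    }
}
```

Users 1 and 2 behave as in my Mahout run (nothing for user 1, item 92 for user 2), and user 3 gets items from their neighbors even though they have no rows in the second file. That last part is what I can't get out of GenericBooleanPrefUserBasedRecommender.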