I have a model like this (non-Hadoop):
DataModel data = new FileDataModel(new File("file.csv"));
UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(data);
userSimilarity.setPreferenceInferrer(new AveragingPreferenceInferrer(data));
UserNeighborhood userNeighborhood = new NearestNUserNeighborhood(1, userSimilarity, data);
The value returned by userSimilarity is not normalized to the range [0, 100], so when I want to show it to the end user I use the following workaround:
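// Similarity to the single nearest neighbor, used as the per-user maximum.
double maxSim = userSimilarity.userSimilarity(userId1, userNeighborhood.getUserNeighborhood(userId1)[0]);
// Scale relative to that maximum and clamp to [0, 100].
int finalSimilarity = Math.min(100, Math.max((int) Math.ceil(100 * userSimilarity.userSimilarity(userId1, userId2) / maxSim), 0));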
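To spell out the scaling step on its own, here is a minimal sketch that works on plain double values and does not touch Mahout at all. The class name `SimilarityScale` and the method `toPercent` are hypothetical helpers made up for illustration, and returning 0 when the maximum is non-positive or a value is NaN is my own assumption about how those edge cases should be displayed:

```java
// Hypothetical helper, not part of Mahout: scales a raw similarity
// relative to a per-user maximum and clamps the result to [0, 100].
final class SimilarityScale {

    private SimilarityScale() {
        // static utility, no instances
    }

    static int toPercent(double similarity, double maxSimilarity) {
        // Assumption: an undefined similarity (NaN) or a non-positive
        // maximum is shown as 0 instead of dividing by it.
        if (Double.isNaN(similarity) || Double.isNaN(maxSimilarity) || maxSimilarity <= 0.0) {
            return 0;
        }
        // Same arithmetic as the one-liner above: scale, round up, clamp.
        double scaled = Math.ceil(100.0 * similarity / maxSimilarity);
        return (int) Math.min(100.0, Math.max(0.0, scaled));
    }
}
```

With this helper, the two raw values would be fetched once each from userSimilarity and then passed in, for example `SimilarityScale.toPercent(sim, maxSim)`.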
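I am seeing performance problems with this (a varying number of seconds per user). Is there another approach, or a faster way, to get min(similarity) = 0 and max(similarity) = 100 for each given user?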