
I am working with Mahout and ran into a problem after changing my CSV file. Before the change, I was getting proper recommendations.

Example code:

DataModel model = new FileDataModel(new File("E:\\WriteTest.csv"));
UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
// Neighborhood made up of the 2 most similar users
UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);
Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

// Ask for at most 1 recommendation for user 1
List<RecommendedItem> recommendations = recommender.recommend(1, 1);

for (RecommendedItem recommendation : recommendations) {
    System.out.println(recommendation);
}

I only updated the values in my CSV, and it stopped giving me any suggestions.

CSV that is not giving me any result:

1,13,9.9
1,26,9.0
1,40,4.0
2,83,9.9
2,167,9.0
2,250,4.0
3,91,9.9
3,167,9.0
3,274,4.0
4,91,9.9
4,167,2.0

CSV which is giving me result:

1,101,5.0
1,102,3.0
1,103,3.0

2,101,5.0
2,102,2.5
2,103,3.0
2,104,2.1

3,101,5.0
3,102,2.5
3,105,4.0
3,107,5.0

4,102,2.0
4,104,4.0
4,105,2.5
4,106,3.0
4,107,2.6

5,101,5.0
5,102,3.4
5,104,2.5
5,105,2.5
5,106,1.0

Output on console respectively:

Result from 1st dataset:

Aug 27, 2011 2:45:06 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file WriteTest.csv
Aug 27, 2011 2:45:06 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
Aug 27, 2011 2:45:06 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 11
Aug 27, 2011 2:45:06 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 4 users

I was expecting item 167 to be recommended, but I didn't get any recommendations.

Output of 2nd dataset:

Aug 27, 2011 2:52:42 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file WriteTest.csv
Aug 27, 2011 2:52:42 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
Aug 27, 2011 2:52:42 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 21
Aug 27, 2011 2:52:42 AM org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 5 users
RecommendedItem[item:105, value:3.25]

1 Answer


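The recommender is working correctly. The problem is that your data is too sparse. It can't find any similarity that links user 1 to another user, so there is nothing through which 167 could become recommendable. Try a more realistic data set, and I think the behavior won't look so surprising.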
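To make the sparsity concrete, here is a minimal sketch in plain Java (no Mahout calls) that lists which items user 1 has in common with each other user in both CSV files from the question. The class and method names (`OverlapCheck`, `sparseDataset`, `workingDataset`, `sharedItems`) are made up for illustration. It rests on the assumption that a Pearson-style user similarity is computed only over items both users have rated, so an empty overlap means there is no similarity to build a neighborhood from.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class OverlapCheck {

    // Item IDs rated by each user in the CSV that returns no recommendations.
    static Map<Long, Set<Long>> sparseDataset() {
        Map<Long, Set<Long>> items = new LinkedHashMap<>();
        items.put(1L, new TreeSet<>(Arrays.asList(13L, 26L, 40L)));
        items.put(2L, new TreeSet<>(Arrays.asList(83L, 167L, 250L)));
        items.put(3L, new TreeSet<>(Arrays.asList(91L, 167L, 274L)));
        items.put(4L, new TreeSet<>(Arrays.asList(91L, 167L)));
        return items;
    }

    // Item IDs rated by each user in the CSV that does return a recommendation.
    static Map<Long, Set<Long>> workingDataset() {
        Map<Long, Set<Long>> items = new LinkedHashMap<>();
        items.put(1L, new TreeSet<>(Arrays.asList(101L, 102L, 103L)));
        items.put(2L, new TreeSet<>(Arrays.asList(101L, 102L, 103L, 104L)));
        items.put(3L, new TreeSet<>(Arrays.asList(101L, 102L, 105L, 107L)));
        items.put(4L, new TreeSet<>(Arrays.asList(102L, 104L, 105L, 106L, 107L)));
        items.put(5L, new TreeSet<>(Arrays.asList(101L, 102L, 104L, 105L, 106L)));
        return items;
    }

    // Items that both users have rated (the intersection of their item sets).
    static Set<Long> sharedItems(Map<Long, Set<Long>> itemsByUser, long a, long b) {
        Set<Long> shared = new TreeSet<>(itemsByUser.get(a));
        shared.retainAll(itemsByUser.get(b));
        return shared;
    }

    // Prints the overlap between user 1 and every other user in the data set.
    static void report(String label, Map<Long, Set<Long>> data) {
        System.out.println(label);
        for (long other : data.keySet()) {
            if (other != 1L) {
                System.out.println("  user 1 vs user " + other
                        + ": shared items " + sharedItems(data, 1L, other));
            }
        }
    }

    public static void main(String[] args) {
        report("CSV with no result:", sparseDataset());
        report("CSV with a result:", workingDataset());
    }
}
```

For the first CSV every overlap with user 1 is empty, because items 13, 26 and 40 were rated by nobody else. For the second CSV user 1 shares at least one item with every other user, which is what lets a neighborhood form and an item such as 105 be recommended.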

answered 2011-08-27T12:11:17.747