If you think your sample data is complete enough for computing the item-item similarities, why not precompute them and store them in a collection:
Collection<GenericItemSimilarity.ItemItemSimilarity> correlationMatrix = new ArrayList<GenericItemSimilarity.ItemItemSimilarity>();
You can then create your ItemSimilarity from it like this:
ItemSimilarity similarity = new GenericItemSimilarity(correlationMatrix);
That said, I don't think it is a good idea to compute item-item similarities from only a sample of your data, because you might be missing a lot of useful preference values. If computing them on the fly is too slow, you can always precompute them, store them in a database, and load them when needed.
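To make the precomputation step concrete, here is a minimal plain-Java sketch that does not depend on Mahout. It assumes the preferences fit in a dense user-by-item array where 0.0 means "no preference", and it uses cosine similarity purely as an illustration; the class name PrecomputeItemSimilarities and the Pair type are hypothetical. Each Pair holds an (itemID1, itemID2, value) triple, which is the kind of entry you would then turn into a GenericItemSimilarity.ItemItemSimilarity or write to a database table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: precomputes pairwise item similarities.
class PrecomputeItemSimilarities {

    // One precomputed entry: two item IDs and their similarity value.
    static final class Pair {
        final long itemID1;
        final long itemID2;
        final double value;

        Pair(long itemID1, long itemID2, double value) {
            this.itemID1 = itemID1;
            this.itemID2 = itemID2;
            this.value = value;
        }
    }

    // Cosine similarity between item columns a and b.
    // Assumption: prefs[user][item] is 0.0 when the user has no preference.
    static double cosine(double[][] prefs, int a, int b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (double[] row : prefs) {
            dot += row[a] * row[b];
            normA += row[a] * row[a];
            normB += row[b] * row[b];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // an item nobody rated is similar to nothing
        }
        return dot / Math.sqrt(normA * normB);
    }

    // Computes each unordered item pair once; array indices double as item IDs here.
    static List<Pair> precompute(double[][] prefs) {
        int numItems = prefs.length == 0 ? 0 : prefs[0].length;
        List<Pair> result = new ArrayList<>();
        for (int a = 0; a < numItems; a++) {
            for (int b = a + 1; b < numItems; b++) {
                result.add(new Pair(a, b, cosine(prefs, a, b)));
            }
        }
        return result;
    }
}
```

The pairwise loop is quadratic in the number of items, which is the reason to run it offline on the full data set rather than at recommendation time.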
If you are still getting this error, then you are probably using your sample data model in the recommendation class, or using a UserSimilarity to compute the item similarities.
If you want to add new users, one option is to use Mahout's FileDataModel and update the file periodically to include them (I think you can also create a new file with some suffix, but I am not sure). You can find more about this in the book Mahout in Action. The in-memory DataModel implementations are immutable, so the other option is to extend one of them and implement the methods setPreference() and removePreference() yourself.
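As a rough idea of what a mutable model has to keep track of, here is a minimal plain-Java sketch. It is not the implementation mentioned in this answer and does not implement Mahout's DataModel interface; the class name MutablePreferenceStore is hypothetical, and only the method names follow the DataModel methods. It is also not thread-safe, which a real implementation serving recommendations during updates would need to handle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a mutable data model: userID -> (itemID -> preference value).
class MutablePreferenceStore {

    private final Map<Long, Map<Long, Float>> prefsByUser = new HashMap<>();

    // Adds a new preference or overwrites an existing one; creates the user on first use.
    void setPreference(long userID, long itemID, float value) {
        prefsByUser.computeIfAbsent(userID, id -> new HashMap<>()).put(itemID, value);
    }

    // Removes one preference; a user with no remaining preferences is dropped entirely.
    void removePreference(long userID, long itemID) {
        Map<Long, Float> prefs = prefsByUser.get(userID);
        if (prefs == null) {
            return;
        }
        prefs.remove(itemID);
        if (prefs.isEmpty()) {
            prefsByUser.remove(userID);
        }
    }

    // Returns the stored value, or null if the user has not rated the item.
    Float getPreferenceValue(long userID, long itemID) {
        Map<Long, Float> prefs = prefsByUser.get(userID);
        return prefs == null ? null : prefs.get(itemID);
    }

    int getNumUsers() {
        return prefsByUser.size();
    }
}
```

A real subclass would also need to keep any precomputed similarities in sync, since changing preferences can invalidate them.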
EDIT: I have an implementation of a MutableDataModel that extends AbstractDataModel. I can share it with you if you want.