java - A recommendation engine for one-of-a-kind "products"

Question

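We are building a marketplace that lets people list things they want to sell, but sold in lots/bags/boxes. We are building a recommendation engine for it, but most of the articles out there seem better suited to marketplaces that "sell" large quantities of the same product, i.e. Amazon, Netflix, etc. Since every listing is somewhat unique, what is the best approach for a recommendation engine? Are there any relevant articles?

We know which items people have bought in the past. We know the size or age appropriateness they are looking for.

Listed bundles have a category, brand, size/age, color, and free-form text.

Any ideas to help us get started? If our data is stored in MySQL, which particular language do you think would be best?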

score 2 · Accepted Answer

A recommendation engine can filter on several things. You can filter on what a particular user has bought before (in your case, which features were present in the bundles they bought). You can also filter on social groupings (users like them) or on product groupings (other products like the ones you have sold before). I'd recommend that you first cluster the products, and then map individual users or groups to the features in each cluster. That gives you a recommendation engine that says: people who bought items with this feature also bought items with these features. Then you can create an engine for known users: you tend to buy products with these features, so here are some more items like those. Finally, you can build an engine for groups: people like this tend to buy products with these features.
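As a rough illustration of the "known user" engine described above, here is a minimal sketch that builds a feature profile from past purchases and ranks unseen bundles by feature overlap. The dictionary keys mirror the bundle fields from the question; the sample data, function names, and the simple count-based score are assumptions made for illustration, not a prescribed design.

```python
from collections import Counter

FIELDS = ("category", "brand", "size", "color")  # assumed column names

def feature_set(bundle):
    """Turn a bundle's structured fields into a set of (field, value) features."""
    return {(field, bundle[field]) for field in FIELDS if bundle.get(field)}

def user_profile(purchased_bundles):
    """Count how often each feature appears in a user's past purchases."""
    profile = Counter()
    for bundle in purchased_bundles:
        profile.update(feature_set(bundle))
    return profile

def score(profile, bundle):
    """Sum the profile counts of the features the candidate bundle shares."""
    return sum(profile[feature] for feature in feature_set(bundle))

def recommend(profile, candidates, top_n=3):
    """Return the best-scoring candidates, dropping those with no overlap."""
    ranked = sorted(candidates, key=lambda b: score(profile, b), reverse=True)
    return [b for b in ranked[:top_n] if score(profile, b) > 0]

# Hypothetical sample data.
history = [
    {"id": 1, "category": "clothes", "brand": "BrandA", "size": "2T", "color": "blue"},
    {"id": 2, "category": "clothes", "brand": "BrandB", "size": "2T", "color": "red"},
]
candidates = [
    {"id": 10, "category": "clothes", "brand": "BrandA", "size": "2T", "color": "green"},
    {"id": 11, "category": "toys", "brand": "BrandC", "size": "5+", "color": "red"},
    {"id": 12, "category": "books", "brand": "BrandD", "size": "8+", "color": "white"},
]
profile = user_profile(history)
recommended_ids = [b["id"] for b in recommend(profile, candidates)]
```

The same profile idea extends to groups by pooling the purchase histories of every user in the group before counting features.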

With several models in hand, your system can turn to the appropriate one, depending on what it knows at the moment: known user, known user group, or just known browsing history.
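One way to picture that hand-off is a small dispatcher that picks the most specific model the available information supports. This is only a sketch; the context keys and model names are hypothetical.

```python
def choose_model(context):
    """Pick the most specific model that the available information supports."""
    if context.get("purchase_history"):
        return "user_model"      # known user with past purchases
    if context.get("group"):
        return "group_model"     # known segment, e.g. parents of toddlers
    return "browsing_model"      # fall back to what was viewed this session

# Hypothetical contexts for three kinds of visitor.
known_user = choose_model({"purchase_history": [101, 102]})
known_group = choose_model({"group": "toddler-parents"})
anonymous = choose_model({"viewed": [5, 9]})
```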

Since you are recommending batches of more unique products, you'll want to add an additional model after you get your recommendations that will filter out inappropriate recommendations. This model will represent compatibility. A new game using the same console that the user used before is more compatible than another console. If they bought a new car last month, you wouldn't recommend a new car, but maybe a package of ten car washes.
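A compatibility filter of this kind can start out as a couple of hand-written rules. The sketch below assumes two: the bundle must match a size/age range the user is looking for, and a "one-off" category bought recently should not be recommended again. The field names, the 60-day cooldown, and the sample data are illustrative assumptions.

```python
from datetime import date, timedelta

def is_compatible(bundle, wanted_sizes, purchases, today,
                  one_off_categories, cooldown_days=60):
    """Return True if recommending `bundle` makes sense for this user right now."""
    # Rule 1: the bundle must match a size/age range the user is looking for.
    if bundle["size"] not in wanted_sizes:
        return False
    # Rule 2: do not repeat a one-off category the user bought recently
    # (the "new car last month" case).
    cutoff = today - timedelta(days=cooldown_days)
    for purchase in purchases:
        if (purchase["category"] == bundle["category"]
                and bundle["category"] in one_off_categories
                and purchase["bought_on"] >= cutoff):
            return False
    return True

# Hypothetical sample data.
today = date(2024, 6, 1)
purchases = [{"category": "stroller", "bought_on": date(2024, 5, 10)}]
candidates = [
    {"id": 1, "category": "stroller", "size": "0-12m"},  # bought too recently
    {"id": 2, "category": "clothes", "size": "0-12m"},   # acceptable
    {"id": 3, "category": "clothes", "size": "5-6y"},    # wrong size
]
kept = [b["id"] for b in candidates
        if is_compatible(b, {"0-12m"}, purchases, today, {"stroller"})]
```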

You could use several different approaches for this last model. If you are going to encode explicit knowledge that's in people's heads, you may want to build a belief network that filters out inappropriate recommendations. If you are going to use collective intelligence, you could use simple regression, a support vector machine, or an artificial neural network. I would go with the filter that is easiest to implement and not worry too much about which model you build first. You'll probably build a handful of models before you settle on one that gives you good results for a reasonable amount of effort.

Your filtering model will go through a test phase where you make a recommendation, filter it for appropriateness, then filter it again with some sort of human intervention--a set of "answers" you want your filter to learn, or just a human being double-checking results. Then you'll retrain your filter with the updated results, resample and test again.
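The review-and-retrain loop can be sketched with a very small learned filter. The example below uses logistic regression trained by batch gradient descent as a stand-in for "simple regression"; the two features, the labels, and the hyperparameters are assumptions chosen to keep the example tiny, and a real filter would use one of the libraries mentioned in this answer.

```python
import math

def predict(model, features):
    """Probability that a recommendation is appropriate."""
    weights, bias = model
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, learning_rate=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for features, label in examples:
            error = predict((weights, bias), features) - label
            grad_b += error
            for i, x in enumerate(features):
                grad_w[i] += error * x
        bias -= learning_rate * grad_b
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    return weights, bias

# Features: (size_matches, bought_same_category_recently); label 1 = appropriate.
labeled = [((1, 0), 1), ((0, 0), 0), ((0, 1), 0)]
model = train(labeled)

# A reviewer rejects a recommendation the filter has not seen before ...
human_corrections = [((1, 1), 0)]
# ... so the correction joins the training set and the filter is retrained.
labeled += human_corrections
model = train(labeled)
decisions = [predict(model, features) >= 0.5 for features, _ in labeled]
```

Each pass through the loop grows the labeled set, so the filter gradually absorbs the reviewer's judgment.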

As far as the recommendation engine goes, you can do SVD with the GNU Scientific Library (bindings are available for just about any platform). You could also choose the Mahout recommendation engine (part of the Hadoop world) if you are going to be using big data. For the filter, you may want to look at apophenia, libsvm, or FANN.
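To give a feel for what the SVD step contributes, here is a standard-library sketch that approximates only the largest singular value and its vectors by power iteration, then uses them for a rank-1 reconstruction of a user-by-feature matrix. Power iteration is a swapped-in simplification, not what GSL or Mahout do internally, and the matrix is made up; a real system would keep several components and use one of the libraries above.

```python
import math

def top_singular_triplet(matrix, iterations=100):
    """Approximate the largest singular value and its vectors by power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # One step of v <- normalize(A^T A v).
        av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return sigma, u, v

# Rows are users, columns are features (say clothes, toys, books);
# entries count how many purchased bundles carried that feature.
purchases = [
    [3, 0, 1],
    [2, 0, 0],
    [0, 4, 1],
    [0, 3, 0],
]
sigma, u, v = top_singular_triplet(purchases)
# Rank-1 reconstruction: a smoothed score for every user/feature pair.
scores = [[sigma * u[i] * v[j] for j in range(3)] for i in range(4)]
```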

You could also choose to work in an analytics framework for a while until you feel like you've got a handle on things. Some to choose from are Weka, R, Octave, Matlab, Maple, and Mathematica. I think I've listed those in terms of price first, then ease of use.

As far as resources go, there are a few good introductory books: Programming Collective Intelligence, Mahout in Action (MEAP from Manning), Data Mining (all about Weka), and Modeling with Data (the apophenia reference).

My last thought is that however sophisticated you do or don't get with your recommendation engine, most of the value is in the user experience. One of the people from Amazon wrote that their recommendation engines worked best when they told the user why they were making a recommendation. That helps the user quickly adopt your reasoning (an emotive response to their old and good purchase), or reject it and keep going (they already have something like that, they don't need another one).
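Telling the user why can be as simple as listing the features a recommended bundle shares with their past purchases. A minimal sketch, assuming the same bundle fields as in the question; the wording of the messages and the function name are hypothetical.

```python
def explain(candidate, purchased_bundles,
            fields=("category", "brand", "size", "color")):
    """Build a short reason string from features shared with past purchases."""
    reasons = []
    for field in fields:
        value = candidate.get(field)
        if value and any(p.get(field) == value for p in purchased_bundles):
            reasons.append(f"{field} {value}")
    if not reasons:
        return "New arrival you might like."
    return "Recommended because you bought bundles with " + ", ".join(reasons) + "."

# Hypothetical sample data.
past = [{"category": "clothes", "brand": "BrandB", "size": "2T", "color": "blue"}]
candidate = {"category": "clothes", "brand": "BrandA", "size": "2T", "color": "green"}
message = explain(candidate, past)
```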

score 0 · Accepted Answer

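Personally I prefer Ruby, but Ruby, Python, and Perl can all connect to MySQL easily.

One of the reasons I like Ruby is its Sequel gem, a very powerful ORM that makes database access very easy to manage. If you use MVC, Ruby has Rails, which supports ActiveRecord as its ORM and also makes talking to MySQL easy. There are also Sinatra and Padrino, which are lighter-weight frameworks but also very powerful. They are database-agnostic out of the box and integrate well with Sequel.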
