math - How do I calculate popularity of content?

Question

I'm developing a web site where users rate content (1-5 stars). I need to measure the popularity of the content (also referred to as importance/hotness/interest). My first thought was just to sum the user ratings for a piece of content, each offset by 2.5:

Popularity = SUM(Rating - 2.5)

If two users give it 5 stars and one gives it 2 stars, it gets a popularity of 2.5 + 2.5 - 0.5 = 4.5. The value then gets dampened depending on how old the content is. I want it to be as accurate as possible, so I'm wondering if this is "good enough" or if there is a better way, e.g. by analyzing the distribution of ratings, or if I must bring in more metrics (views, comments, shares, time spent on content etc.).
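As a minimal sketch of the formula above (the function name `popularity` and the `midpoint` parameter are only illustrative, and the age dampening is left out because its exact form isn't specified):

```python
def popularity(ratings, midpoint=2.5):
    """Sum of ratings, each shifted by the midpoint (2.5 in the formula above)."""
    return sum(r - midpoint for r in ratings)


# Two 5-star ratings and one 2-star rating: 2.5 + 2.5 - 0.5
print(popularity([5, 5, 2]))  # 4.5
```

Note that this score grows with the number of ratings, so it mixes "how many people rated" with "how well they rated".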

score 3 · Accepted Answer

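Kind of a classic question, this. Your approach is good, but does it take the reliability of the score into account? You imply that it doesn't.

The more ratings a post gets, the more reliably the rating tells you its value.

On the other hand, a single bad rating can't be trusted.

Being able to account for the reliability of your data set, and to factor that into what it tells us, is what Bayes in statistics is all about. You need a Bayesian average: see these articles here, and an excellent set of resources here.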
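A rough sketch of one common form of the Bayesian average, assuming a prior mean and a prior weight (a pseudo-count of "virtual" ratings). The names `prior_mean` and `prior_weight` and their default values are assumptions for illustration, not something the answer specifies; in practice the prior mean is often set to the site-wide average rating.

```python
def bayesian_average(ratings, prior_mean=3.0, prior_weight=5):
    """Pull the item's mean toward prior_mean; the pull weakens as ratings accumulate.

    prior_mean and prior_weight are illustrative assumptions: the item is treated
    as if it already had prior_weight ratings, each equal to prior_mean.
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)


# A single 5-star rating barely moves the score away from the prior...
print(round(bayesian_average([5]), 2))       # 3.33
# ...while fifty 5-star ratings dominate it.
print(round(bayesian_average([5] * 50), 2))  # 4.82
```

With few ratings the score stays close to the prior, so a single very good or very bad rating cannot push an item to the top or bottom of a ranking.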

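Since this is a Stack Overflow question, here's one of the many canonical SO questions on how to compute the average.

If you'd like to discover the historical and philosophical dimensions of this old nugget, this is a good book.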

score 1 · Accepted Answer

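First of all, popularity is not a well-defined concept. One might think it is proportional to ratings, but I could also say "Movie A is popular because everyone has watched it, but its quality is not as good as expected." In that case there are a lot of ratings, but overall the ratings are not too good.

In a naive way, you can measure the average offset of each movie's ratings from the global average.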
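A small sketch of that naive measure, assuming ratings are held in a dictionary keyed by item; the function name `mean_offsets` and the sample data are made up for illustration.

```python
from statistics import mean


def mean_offsets(ratings_by_item):
    """Return, per item, the item's mean rating minus the mean of all ratings."""
    all_ratings = [r for ratings in ratings_by_item.values() for r in ratings]
    global_mean = mean(all_ratings)
    return {item: mean(ratings) - global_mean
            for item, ratings in ratings_by_item.items()}


# Hypothetical data: the global mean is (5 + 5 + 2 + 3 + 4) / 5 = 3.8
offsets = mean_offsets({"A": [5, 5, 2], "B": [3, 4]})
print(offsets)  # "A" is about +0.2, "B" is about -0.3
```

This ignores how many ratings each item has, so an item with one rating counts as much as an item with thousands.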

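In a more sophisticated way, you should also take into account how many ratings there are, which is hard to formulate.

Typically, if you are building a recommender system, you use item similarities or user similarities and so on. This is because they are relative. Popularity, by default, should be a bounded absolute scale, and it is hard to formulate that correctly for recommendation.

If you are going for a recommender system, I suggest you read the following paper: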

http://www.grouplens.org/node/475
