redis - 我应该如何在 Redis 中建模？

Question

仅供参考：Redis n00b。

我需要在我的网络应用程序中存储搜索词。

每个词都有两个属性：“search_count”（整数）和“last_searched_at”（时间）

我试过的例子：

Redis.hset("search_terms", term, {count: 1, last_searched_at: Time.now})

我可以想出几种不同的方法来存储它们，但没有很好的方法来查询数据。我需要生成的报告是“过去 30 天内的热门搜索词”。在 SQL 中，这将是 where 子句和 order by。

我将如何在 Redis 中做到这一点？我应该使用不同的数据类型吗？

提前致谢！

score 4 · Accepted Answer

I would consider two ordered sets.

When a search term is submitted, get the current timestamp and:

zadd timestamps timestamp term
zincrby counts 1 term

The above two operations should be atomic.

Then to find all terms in the given time interval timestamp_from, timestamp_to:

zrangebyscore timestamps timestamp_from timestamp_to

after you get these, loop over them and get the counts from counts.

Alternatively, I am curious whether you can use zunionstore. Here is my test in Ruby:

require 'redis'

KEYS = %w(counts timestamps results)
TERMS = %w(test0 keyword1 test0 test1 keyword1 test0 keyword0 keyword1 test0)

def redis
  @redis ||= Redis.new
end

def timestamp
  (Time.now.to_f * 1000).to_i
end

redis.del KEYS

TERMS.each {|term|
  redis.multi {|r|
    r.zadd 'timestamps', timestamp, term
    r.zincrby 'counts', 1, term
  }
  sleep rand
}

redis.zunionstore 'results', ['timestamps', 'counts'], weights: [1, 1e15]

KEYS.each {|key|
  p [key, redis.zrange(key, 0, -1, withscores: true)]
}

# top 2 terms
p redis.zrevrangebyscore 'results', '+inf', '-inf', limit: [0, 2]

EDIT: at some point you would need to clear the counts set. Something similar to what @Eli proposed (https://stackoverflow.com/a/16618932/410102).

score 2 · Accepted Answer

取决于您要优化的内容。假设您希望能够非常快速地运行该查询并且不介意消耗一些内存，我会这样做如下。

为您看到一些搜索的每一秒保留一个键（如果您愿意，您可以或多或少地细化）。键应该指向一个哈希值，$search_term -> $count其中 $count 是在那一秒内看到 $search_term 的次数。
为您想要数据的每个时间间隔（我们称之为 $time_int_key）保留另一个键（在您的情况下，这只是您的间隔是最后 30 天的一个键）。这应该指向一个排序集，其中集合中的项目是您在过去 30 天内看到的所有搜索词，它们排序的分数是它们在过去 30 天内被看到的次数。
让后台工作人员每秒抓取 30 天前恰好发生的那一秒的密钥，并循环遍历附加到它的哈希值。对于该键中的每个 $search_term，它应该从 $time_int_key 中与该 $search_term 关联的分数中减去 $count

这样，您可以ZRANGE $time_int_key 0 $m在 O(log(N)+m) 时间内获取 m 个热门搜索（如果您想要搜索的数量，则为 [WITHSCORES]）。这足够便宜，可以在 Redis 中以任何合理的 m 尽可能频繁地运行，并始终实时更新该数据。

redis - 我应该如何在 Redis 中建模？

2 回答 2

Related

Reference