Table structure:

uid       : integer
answer_id : integer

I need to run a query that tells me which uids have the same answers as other uids. For example, here is some test data:

answer_id   uid
1           555
4           555
7           555
10          555
1           123
5           123
7           123
10          123

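From this data we can see that the two users answered 3 of the 4 questions the same way.

I am struggling to write a query that tells me which uids share 3/4 or 4/4 of the same answers. Basically, I am trying to find users whose answers are 75% (3/4) or more (4/4) similar.

This is part of a Ruby on Rails application, so I have all the models built [User, UserAnswers, etc.], but I assume this is just a SQL query and not necessarily part of ActiveRecord.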

2 Answers

This query shows the number of answers a given user has in common with each other user:

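declare @uid int = 555  -- example value: the user to compare against everyone else

select
  ans1.uid as user1,
  ans2.uid as user2,
  count(*) as common
from 
  ans ans1 inner join ans ans2
  on ans1.answer_id = ans2.answer_id
     and ans1.uid <> ans2.uid
where ans1.uid = @uid  -- qualified: a bare uid is ambiguous in a self-join
group by ans1.uid, ans2.uid  -- group by the columns, not the select aliases
having count(*)>0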

This one also shows the number of questions each user has answered:

select
  ans1.uid as user1,
  ans2.uid as user2,
  count(distinct ans1.answer_id) as total1,
  count(distinct ans2.answer_id) as total2,
  sum(case when ans1.answer_id = ans2.answer_id then 1 else 0 end) as common
from 
  ans ans1 inner join ans ans2 on ans1.uid <> ans2.uid
group by ans1.uid, ans2.uid  -- group by the columns, not the select aliases
having count(*)>0

(This second query may be slow, because it pairs every row of one user with every row of every other user.)
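
As a quick sanity check, the second query can be run against the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the real one, and the helper name pair_counts is made up for illustration.

```python
import sqlite3

# Sample rows (answer_id, uid) copied from the question.
ROWS = [(1, 555), (4, 555), (7, 555), (10, 555),
        (1, 123), (5, 123), (7, 123), (10, 123)]

# The second query from the answer, plus an ORDER BY so output is stable.
QUERY = """
select ans1.uid as user1,
       ans2.uid as user2,
       count(distinct ans1.answer_id) as total1,
       count(distinct ans2.answer_id) as total2,
       sum(case when ans1.answer_id = ans2.answer_id then 1 else 0 end) as common
from ans ans1 inner join ans ans2 on ans1.uid <> ans2.uid
group by ans1.uid, ans2.uid
order by ans1.uid, ans2.uid
"""


def pair_counts(rows):
    """Load rows into an in-memory SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table ans (answer_id integer, uid integer)")
    conn.executemany("insert into ans (answer_id, uid) values (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pair_counts(ROWS):
        print(row)  # (user1, user2, total1, total2, common)
```

On the sample data each user has 4 answers and the pair shares 3 of them, which matches the 3/4 described in the question.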

answered 2012-11-16T17:39:51.007

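FThiella's answer works well. However, there is no need for a Cartesian-product join. The following version produces the same counts without such an expensive join: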

select ans1.uid as user1,
       ans2.uid as user2,
       max(ans1.numanswers) as total1,
       max(ans2.numanswers) as total2,
       count(*) as common
from (select a.*, count(*) over (partition by uid) as numanswers
      from UserAnswers a
     ) ans1 inner join
     (select a.*, count(*) over (partition by uid) as numanswers
      from UserAnswers a
     ) ans2
     on ans1.uid <> ans2.uid and
        ans1.answer_id = ans2.answer_id
group by ans1.uid, ans2.uid

As with the first query in the other answer, this does not include pairs of users who have no answers in common.
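
To get from these counts to the 75% threshold asked about in the question, a HAVING clause on the ratio should be enough. The sketch below makes several assumptions: it runs in an in-memory SQLite database through Python's sqlite3 module, it swaps the window function for a grouped subquery (so it also runs on older SQLite builds), it measures the ratio against user1's total, and the names similar_users and min_ratio are hypothetical.

```python
import sqlite3

# Sample rows (answer_id, uid) copied from the question.
ROWS = [(1, 555), (4, 555), (7, 555), (10, 555),
        (1, 123), (5, 123), (7, 123), (10, 123)]

# Join on matching answers, attach user1's total, keep pairs above the ratio.
# Assumption: the ratio is common / user1's total number of answers.
SIMILAR_USERS_SQL = """
select ans1.uid as user1,
       ans2.uid as user2,
       tot.numanswers as total1,
       count(*) as common
from UserAnswers ans1
     inner join UserAnswers ans2
        on ans1.uid <> ans2.uid
       and ans1.answer_id = ans2.answer_id
     inner join (select uid, count(*) as numanswers
                 from UserAnswers
                 group by uid) tot
        on tot.uid = ans1.uid
group by ans1.uid, ans2.uid, tot.numanswers
having count(*) * 1.0 / tot.numanswers >= :min_ratio
order by ans1.uid, ans2.uid
"""


def similar_users(rows, min_ratio=0.75):
    """Return (user1, user2, total1, common) for pairs at or above min_ratio."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table UserAnswers (answer_id integer, uid integer)")
    conn.executemany(
        "insert into UserAnswers (answer_id, uid) values (?, ?)", rows)
    result = conn.execute(SIMILAR_USERS_SQL, {"min_ratio": min_ratio}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in similar_users(ROWS):
        print(row)
```

With the sample data both directions of the pair (123, 555) pass at 0.75, and nothing passes once min_ratio is raised above 0.75. If users can answer different numbers of questions, the choice of denominator (user1's total, user2's total, or the larger of the two) changes which pairs qualify, so that part needs to be decided for the real data.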

answered 2012-11-16T18:33:18.920