
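I need a query that finds recommended TV shows for a user, based on the TV shows the user is already following. For this I have the following tables: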

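  • The Progress table, which contains the shows the user is following and the percentage of episodes watched (for this problem we can assume there is only one user in the database).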

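  • The Suggested table, which contains _id1, _id2 and value (value is the strength of the connection between the show with id = _id1 and the show with id = _id2; the higher the value, the more the two shows have in common). Note that the relation is commutative, so the connection strength between _id1 and _id2 is the same as between _id2 and _id1. Moreover, there are no two rows such that ROW1._id1 = ROW2._id2 AND ROW1._id2 = ROW2._id1.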

  • The ShowCache table, which contains details about the TV shows, such as the name.

The following query is what I am trying to do, but the result is an empty set:

SET @a = 0;   -- in other tests this line seems to be necessary
SELECT `ShowCache`.*,
       (SUM(value) * (Progress.progress)) as priority
FROM `Suggested`, `ShowCache`, Progress
WHERE
    ((_id2 = Progress.id AND _id1 NOT IN (SELECT id FROM Progress) AND @a:=_id1)  -- is there a better way to set a variable here?
    OR
    (_id1 = Progress.id AND _id2 NOT IN (SELECT id FROM Progress) AND @a:=_id2))
    AND `ShowCache`._id = @a   -- I think the query fails here
GROUP BY `ShowCache`._id
ORDER BY priority DESC
LIMIT 0,20

I know the problem is related to how I use the variable, but I cannot solve it. Any help is much appreciated.


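PS: the main problem is that, because of the commutativity, without a variable I need two queries, which take about 3 seconds to execute (the real queries are more complex than the one above). I am really trying to do this in a single query.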

PPS: I also tried the XOR operator, which resulted in an infinite loop?!?!? This is the WHERE clause I tried:

((_id2=Progress.id AND @a:=_id1) XOR (_id1=Progress.id AND @a:=_id2)) AND `ShowCache`._id = @a

EDIT: I came up with this WHERE condition, which does not use any variables:

(_id2 = Progress.id OR _id1 = Progress.id) 
AND `ShowCache`._id = IF(_id2 = Progress.id, _id1,_id2)
AND  `ShowCache`._id NOT IN (SELECT id FROM Progress)

It works, but it is very slow.


2 Answers


Your attempt to use XOR is clever. To get the non-matching value of the pair, though, you want the bitwise XOR operator, which is ^ (the XOR keyword in MySQL is a logical operator):

Progress.id ^ _id1 ^ _id2

3 ^ 2 ^ 3 = 2

2 ^ 2 ^ 3 = 3
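
The two worked examples rely on XOR cancelling a value against itself, so XOR-ing a pair with one of its members leaves the other member. A minimal sketch of that identity in Python, whose ^ is also bitwise XOR on integers (the helper name other_id is made up for illustration):

```python
def other_id(known_id: int, id1: int, id2: int) -> int:
    """Return the member of the pair (id1, id2) that is not known_id.

    Assumes known_id equals id1 or id2; otherwise the result is meaningless.
    """
    return known_id ^ id1 ^ id2


print(other_id(3, 2, 3))  # 2
print(other_id(2, 2, 3))  # 3
```

If known_id matches neither member of the pair, the result is just an unrelated number, so the trick is only safe once the pair is already known to contain the id.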

You can use this trick to set up a join and really simplify your query (eliminating the ORs and the duplicated NOT IN, and doing everything in one query without variables):


select users.name as username, showcache.name as show_name,
  sum(progress * value) as priority
from users
inner join progress on users.id = progress.user_id
-- match every suggestion pair that contains a show the user follows
inner join suggested on progress.show_id in (suggested.id_1, suggested.id_2)
-- XOR cancels the followed show's id, leaving the other id of the pair
inner join showcache on showcache.id =
  (suggested.id_1 ^ suggested.id_2 ^ progress.show_id)
where showcache.id not in
  (select show_id from progress where user_id = users.id)
-- group per user as well, so totals are not merged across users
group by users.id, showcache.id
order by priority desc;

I also set up a fiddle to demonstrate it: http://sqlfiddle.com/#!2/2dcd8/24

To break it down: I created a users table with a single user (but the solution will work with multiple users).

The select and the join to progress are straightforward. The join to suggested uses IN as an alternative to writing the condition with OR.

The join to showcache is where the bitwise XOR happens. One of the two ids in the pair matches progress.show_id, and we want to join on the other one.

The query does still include one NOT IN, to exclude shows the user already follows from the results. I could have changed it to NOT EXISTS, but it seems clearer this way.
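
For readers who want to try the join without a MySQL server, here is a rough sketch using Python's sqlite3 module. Several things in it are assumptions and not part of the answer: the sample rows are invented, the column types are guessed, the grouping adds users.id so that each user gets separate totals, and because SQLite has no XOR operator a user-defined function called bxor (a made-up name) stands in for MySQL's ^.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no XOR operator, so expose Python's ^ as a SQL function (hypothetical name).
conn.create_function("bxor", 2, lambda a, b: a ^ b)

conn.executescript("""
CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE showcache (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE progress  (user_id INTEGER, show_id INTEGER, progress REAL);
CREATE TABLE suggested (id_1 INTEGER, id_2 INTEGER, value REAL);

-- Invented sample data: alice follows shows 1 and 2.
INSERT INTO users VALUES (1, 'alice');
INSERT INTO showcache VALUES (1, 'Show A'), (2, 'Show B'), (3, 'Show C'), (4, 'Show D');
INSERT INTO progress VALUES (1, 1, 0.5), (1, 2, 1.0);
-- Each unordered pair of shows appears only once.
INSERT INTO suggested VALUES (1, 2, 9), (1, 3, 4), (3, 2, 2), (2, 4, 1);
""")

QUERY = """
SELECT showcache.name AS show_name,
       SUM(progress.progress * suggested.value) AS priority
FROM users
JOIN progress  ON users.id = progress.user_id
JOIN suggested ON progress.show_id IN (suggested.id_1, suggested.id_2)
JOIN showcache ON showcache.id =
     bxor(bxor(suggested.id_1, suggested.id_2), progress.show_id)
WHERE showcache.id NOT IN
      (SELECT show_id FROM progress WHERE user_id = users.id)
GROUP BY users.id, showcache.id
ORDER BY priority DESC
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Show C', 4.0), ('Show D', 1.0)]
```

With this data, Show C scores 0.5 * 4 + 1.0 * 2 = 4.0 and Show D scores 1.0 * 1 = 1.0, while the two followed shows are filtered out by the NOT IN.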

answered 2013-04-23T16:24:17.583

You are setting the value of @a twice in the WHERE clause, which means the query actually boils down to:

...
WHERE ... AND `ShowCache`._id = _id2

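MySQL evaluates variable assignments in the order it encounters them (this is observed behaviour; the MySQL manual does not guarantee an evaluation order for expressions involving user variables), so you should leave @a constant until the end of the clause and only then assign a new value, e.g.: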

mysql> set @a=0;
mysql> select @a, @a+1, @a*5, @a := @a + 1, @a from t;  -- t is a placeholder for any table with three rows
+------+------+------+--------------+------+
| @a   | @a+1 | @a*5 | @a := @a + 1 | @a   |
+------+------+------+--------------+------+
|    0 |    1 |    0 |            1 |    1 |
|    1 |    2 |    5 |            2 |    2 |
|    2 |    3 |   10 |            3 |    3 |
+------+------+------+--------------+------+

Note that within each row @a keeps the same value across the first three columns, until MySQL reaches @a := @a + 1; from that point on @a has a new value.
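
The table can be reproduced with a small simulation. This is only a Python model of the left-to-right behaviour described here, not MySQL itself; the function name simulate_rows is made up, and it assumes @a starts at 0 and the SELECT runs over three rows, as the output suggests.

```python
def simulate_rows(row_count: int, start: int = 0) -> list[tuple[int, ...]]:
    """Model evaluating `@a, @a+1, @a*5, @a := @a + 1, @a` once per row."""
    a = start
    rows = []
    for _ in range(row_count):
        before = (a, a + 1, a * 5)    # columns read before the assignment
        a = a + 1                     # the `@a := @a + 1` column
        rows.append(before + (a, a))  # assignment result, then the updated @a
    return rows


for row in simulate_rows(3):
    print(row)
```

Each printed tuple corresponds to one row of the table: the first three entries use the old value, the last two use the incremented one.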

So perhaps your query should be:

set @a = 0;
select @temp := @a, ..., @a := _id2
where
   ((_id2 = Progress.id AND _id1 NOT IN (SELECT id FROM Progress) AND @temp =_id1)
   ...
etc...
answered 2013-04-23T16:11:39.177