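I have a very complex query that uses some subqueries inside a CASE statement.

The full query isn't needed for this question; it would only keep people from getting to the problem quickly.

So this post uses pseudocode. I can post the real query if needed, but it is a monster and wouldn't help with this question.

What I want is cacheable subqueries inside the CASE statement.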

SELECT * FROM posts posts
INNER JOIN posts_shared_to shared_to
      ON shared_to.post_id = posts.post_id
INNER JOIN channels channels
      ON channels.channel_id = shared_to.channel_id
WHERE posts.parent_id IS NULL
AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE)
AND CASE
    WHEN channels.read_access IS NULL THEN 1
    WHEN channels.read_access = 1 THEN
    (
      SELECT count(*) FROM channel_users
      WHERE user_id = XXX AND channel_id = channels.channel_id
    )
    WHEN shared_to.read_type = 2 THEN
    (
      /* another subquery with a join */
      /* check if user is in friendlist of post_author */
    )
    ELSE 0
    END
GROUP BY posts.post_id
ORDER BY posts.post_id DESC
LIMIT n,n

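As mentioned above, this is just simplified pseudocode.

MySQL's EXPLAIN says that every subquery used in the CASE is DEPENDENT, which means (if I understand correctly) that they have to run for every row and are not cached.

Any solution that helps speed up this query is welcome.

EDIT: the real query now looks like this: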

SELECT a.id, a.title, a.message AS post_text, a.type, a.date, a.author AS uid, 
b.a_name as name, b.avatar, 
shared_to.to_circle AS circle_id, shared_to.root_circle,
c.circle_name, c.read_access, c.owner_uid, c.profile,
MATCH(a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) AS score

FROM posts a 

/** get userdetails for post_author **/
JOIN authors b ON b.id = a.author

/** get circles posts was shared to **/
JOIN posts_shared_to shared_to ON shared_to.post_id = a.id AND shared_to.deleted IS NULL

/** 
* get circle details. Note: at the moment shared_to.to_circle can also contain NULL or 1 and doesn't need to be a circle_id
* if to_circle IS NULL post was shared public
* if to_circle = 1 post was shared to private circles
* since we use md5 keys as circle ids this can be a string instead of (int) ... ugly..
*
**/
LEFT JOIN circles c ON c.circle_id = shared_to.to_circle 
    /*AND c.circle_name IS NOT NULL */
    AND ( c.profile IS NULL OR c.profile = 6 OR c.profile = 1 ) 
    AND c.deleted IS NULL

LEFT JOIN (
    /** if post is within a channel that requires membership we use this to check if requesting user is member **/
    SELECT COUNT(*) users_count, user_id, circle_id FROM circles_users
    GROUP BY user_id, circle_id
    ) counts ON counts.circle_id = shared_to.to_circle
             AND counts.user_id = :me

LEFT JOIN (
    /** if post is shared private we check if requesting users exists within post authors private circles **/
    SELECT count(*) in_circles_count, ci.owner_uid AS circle_owner, cu1.user_id AS user_me 
    FROM circles ci 
    INNER JOIN circles_users cu1 ON cu1.circle_id = ci.circle_id 
                                 AND cu1.deleted IS NULL 
    WHERE ci.profile IS NULL AND ci.deleted IS NULL
    GROUP BY user_me, circle_owner
) users_in_circles ON users_in_circles.user_me = :me 
                   AND users_in_circles.circle_owner = a.id

/** make sure post is a topic **/
WHERE a.parent_id IS NULL AND a.deleted IS NULL

/** search title and post body **/
AND MATCH (a.title,a.message) AGAINST (:keyword IN BOOLEAN MODE) 

AND (
    /** own circle **/
    c.owner_uid = :me
    /** site member read_access ( this query is for members, for guests we use a different query ) **/
    OR ( c.read_access = 1 OR c.read_access = "1" )
    /** public read_access **/
    OR ( shared_to.to_circle IS NULL OR ( c.read_access IS NULL AND c.owner_uid IS NOT NULL ) )
    /** channel/circle member read_access**/
    OR ( ( c.read_access = 3 OR c.read_access = "3" ) AND counts.users_count > 0 )
    /** for users within post creators private circles **/
    OR ( 
    ( 
    /** use shared_to to determine if post is private **/
    shared_to.to_circle = "1" OR shared_to.to_circle = 1 
    /** use circle settings to determine global privacy **/
    OR ( c.owner_uid IS NOT NULL AND ( c.read_access = 2 OR c.read_access = "2" ) )
    ) AND users_in_circles.circle_owner = a.author AND users_in_circles.user_me = :me
    )
)

GROUP BY a.id ORDER BY a.id DESC LIMIT n,n

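Question: is this really the better approach? Looking at how many rows the derived tables can contain, I'm not sure.

Maybe someone can help me change the query the way @Ollie-Jones suggested: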

SELECT stuff, stuff, stuff
  FROM (
         SELECT post.post_id
           FROM your whole query
          ORDER BY post_id DESC
          LIMIT n,n
       ) ids
  JOIN whatever ON whatever.post_id = ids.post_id
  JOIN whatelse ON whatelse

Sorry if this sounds lazy, but I'm not really a MySQL guy, and building this query has been giving me headaches for years. :D

1 Answer

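The best way to eliminate a dependent subquery is to refactor it into a virtual table (an independent subquery), and then JOIN or LEFT JOIN it to the rest of your tables.

In your case, you have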

     SELECT count(*) FROM channel_users 
      WHERE user_id = XXX AND channel_id = channels.channel_id

So the independent-subquery version of this is

                   SELECT COUNT(*) users_count,
                          user_id, channel_id
                    FROM channel_users
                   GROUP BY user_id, channel_id

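Do you see how that virtual table contains one row for each distinct combination of user_id and channel_id? Each row has the users_count value you need. You can then join it into the rest of your query, like this. (Note that in MySQL INNER JOIN === JOIN, so I used JOIN to shorten things a bit.)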

SELECT * FROM posts posts
  JOIN posts_shared_to shared_to ON shared_to.post_id = posts.post_id
  JOIN channels channels ON channels.channel_id = shared_to.channel_id
  LEFT JOIN (
                   SELECT COUNT(*) users_count,
                          user_id, channel_id
                    FROM channel_users
                   GROUP BY user_id, channel_id
       ) counts ON counts.channel_id = shared_to.channel_id
               AND counts.user_id = XXX
  LEFT JOIN (  /* your other refactored subquery */
            ) friendcounts ON whatever
 WHERE posts.parent_id IS NULL
   AND MATCH (posts.text) AGAINST (:keyword IN BOOLEAN MODE)
   AND (          channels.read_access IS NULL
               OR (channels.read_access = 1 AND counts.users_count > 0)
               OR (shared_to.read_type = 2 AND friendcounts.users_count > 0)
       )
 GROUP BY posts.post_id
 ORDER BY posts.post_id DESC
 LIMIT n,n

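The MySQL query planner is usually smart enough to generate only the appropriate subset of each independent subquery.

Pro tip: SELECT lots of columns ... ORDER BY something LIMIT n is generally considered a wasteful antipattern. It hurts performance because it sorts a whole pile of data columns and then discards most of the result.

Pro tip: SELECT * in a query with JOINs is also wasteful. You are much better off listing the columns you actually need in your result set.

So, you can refactor your query once more to do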
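To check that the two forms select the same rows, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, a cut-down schema with only `channels` and `channel_users`, and made-up sample rows; `ME` is a hypothetical user id playing the role of XXX. It only compares results and says nothing about how MySQL would plan either query.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, read_access INTEGER);
    CREATE TABLE channel_users (user_id INTEGER, channel_id INTEGER);
    INSERT INTO channels VALUES (1, NULL), (2, 1), (3, 1);
    INSERT INTO channel_users VALUES (7, 2), (8, 3);
""")

ME = 7  # hypothetical requesting user (the XXX placeholder)

# Dependent form: the inner SELECT refers to the outer row's channel_id.
dependent = conn.execute("""
    SELECT c.channel_id
      FROM channels c
     WHERE c.read_access IS NULL
        OR (c.read_access = 1
            AND (SELECT COUNT(*) FROM channel_users cu
                  WHERE cu.user_id = ?
                    AND cu.channel_id = c.channel_id) > 0)
     ORDER BY c.channel_id
""", (ME,)).fetchall()

# Independent form: one row per (user_id, channel_id), LEFT JOINed in.
independent = conn.execute("""
    SELECT c.channel_id
      FROM channels c
      LEFT JOIN (
            SELECT COUNT(*) AS users_count, user_id, channel_id
              FROM channel_users
             GROUP BY user_id, channel_id
           ) counts ON counts.channel_id = c.channel_id
                   AND counts.user_id = ?
     WHERE c.read_access IS NULL
        OR (c.read_access = 1 AND counts.users_count > 0)
     ORDER BY c.channel_id
""", (ME,)).fetchall()

print(dependent)
print(independent)
```

Channel 3 drops out in both forms: user 7 has no row there, so the LEFT JOIN yields a NULL users_count and the comparison with 0 is not true.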

    SELECT stuff, stuff, stuff
      FROM (
             SELECT post.post_id
               FROM your whole query
              ORDER BY post_id DESC
              LIMIT n,n
           ) ids
      JOIN whatever ON whatever.post_id = ids.post_id
      JOIN whatelse ON whatelse.

The idea is to sort just the post_id values, then use the LIMITed subset to pull in the rest of the data you need.
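A small sketch of that pattern, again assuming SQLite through Python's sqlite3 as a stand-in for MySQL. The `posts` and `authors` tables, their columns, and the rows are hypothetical and much simpler than the real schema. The inner query sorts and limits only the ids; the outer query joins the wide columns for those few rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; schema and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, parent_id INTEGER,
                        author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES
        (1, NULL, 1, 'first'),
        (2, 1,    2, 'reply'),
        (3, NULL, 2, 'second'),
        (4, NULL, 1, 'third'),
        (5, 4,    2, 'another reply');
""")

page_size, offset = 2, 0  # hypothetical paging values (the n,n placeholder)

rows = conn.execute("""
    SELECT p.post_id, p.title, a.name
      FROM (
            -- sort and limit the narrow id column only
            SELECT post_id
              FROM posts
             WHERE parent_id IS NULL
             ORDER BY post_id DESC
             LIMIT ? OFFSET ?
           ) ids
      JOIN posts p   ON p.post_id = ids.post_id
      JOIN authors a ON a.author_id = p.author_id
     ORDER BY p.post_id DESC
""", (page_size, offset)).fetchall()

print(rows)
```

The outer ORDER BY is repeated on purpose: a join does not promise to preserve the order of the inner subquery, so the final result is sorted again, but only over the handful of rows that survived the LIMIT.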

answered 2016-04-09T12:42:32.707