
I have 4 tables:

talks
talks_fan
talks_follow
talks_comments

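What I want to achieve is to count all the comments, fans, and followers for each talk.

So far, I have come up with this.

All the tables have talk_id, and it is the primary key only in the talks table.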

SELECT
  g.*,
  COUNT( m.talk_id ) AS num_of_comments,
  COUNT( f.talk_id ) AS num_of_followers

FROM
  talks AS g

LEFT JOIN talks_comments AS m
  USING ( talk_id )

LEFT JOIN talks_follow AS f
  USING ( talk_id )

WHERE g.privacy = 'public'
GROUP BY g.talk_id
ORDER BY g.created_date DESC 
LIMIT 30;

I have also tried this approach:

SELECT
  t.*,
  COUNT(b.talk_id) AS comments, 
  COUNT(bt.talk_id) AS followers 
FROM
  talks t
LEFT JOIN talks_follow bt
  ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b
  ON b.talk_id = t.talk_id
GROUP BY t.talk_id;

Both give me the same result......?!

Update: CREATE statements

CREATE TABLE IF NOT EXISTS `talks` (
`talk_id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(9) NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`privacy` enum('public','private') NOT NULL DEFAULT 'private',
PRIMARY KEY (`talk_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=7 ;

CREATE TABLE IF NOT EXISTS `talks_comments` (
`comment_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`comment` text NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

CREATE TABLE IF NOT EXISTS `talks_fan` (
`fan_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` bigint(20) NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`fan_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;

CREATE TABLE IF NOT EXISTS `talks_follow` (
`follow_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`follow_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;

The final query that works

SELECT t.* ,  COUNT( DISTINCT b.comment_id ) AS comments, 
            COUNT( DISTINCT bt.follow_id ) AS followers, 
            COUNT( DISTINCT c.fan_id ) AS fans
FROM talks t

LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id

WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC 
LIMIT 30

Edit: final answer to the whole question......

I have modified the query and written some PHP (CodeIgniter) code to solve my problem, based on @Bill Karwin's recommendation.

    $sql = "
    SELECT t.*,
                    COUNT( DISTINCT b.comment_id ) AS comments,
                    COUNT( DISTINCT bt.follow_id ) AS followers,
                    COUNT( DISTINCT c.fan_id ) AS fans,
                    GROUP_CONCAT( DISTINCT c.user_id ) AS list_of_fans,
                    GROUP_CONCAT( DISTINCT bt.user_id ) AS list_of_follower
    FROM talks t

    LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
    LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
    LEFT JOIN talks_fan c ON c.talk_id = t.talk_id

    WHERE t.privacy = 'public'
    GROUP BY t.talk_id
    ORDER BY t.created_date DESC
    LIMIT 30
    ";

    // Placeholder: this is supposed to be the current user's id (e.g. from the session).
    $current_user_id = 1;

    // Initialised outside the if so $results exists even when no rows come back.
    $results = array();

    $query = $this->db->query($sql);
    if($query->num_rows() > 0)
    {
        foreach($query->result_array() AS $talk){
            // GROUP_CONCAT returns NULL for a talk without fans, so cast before splitting.
            $fan_user_id = explode(",", (string) $talk['list_of_fans']);
            foreach($fan_user_id AS $user){
                 if($user == $current_user_id){
                     // The current user is a fan of this talk.
                     $talk['list_of_fans'] = 'yes';
                 }
            }

            // Same check for followers; list_of_follower comes from the second GROUP_CONCAT.
            $follower_user_id = explode(",", (string) $talk['list_of_follower']);
            foreach($follower_user_id AS $user){
                 if($user == $current_user_id){
                     // The current user follows this talk.
                     $talk['list_of_follower'] = 'yes';
                 }
            }

            $results[] = array(
                    'talk_id'           => $talk['talk_id'],
                    'user_id'           => $talk['user_id'],
                    'title'             => $talk['title'],
                    'created_date'      => $talk['created_date'],
                    'comments'          => $talk['comments'],
                    'followers'         => $talk['followers'],
                    'fans'              => $talk['fans'],
                    'list_of_fans'      => $talk['list_of_fans'],
                    'list_of_follower'  => $talk['list_of_follower']
                    );

        }
    }

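I still believe this could be optimized in the database, so that the result could be used as is...

I'm thinking that if each talk has 1000 followers and 2000 fans, loading the results will take much longer, and longer still if you multiply those numbers by 10. Or am I mistaken...

Edit: adding benchmarks for the query test...

I used the CodeIgniter profiler to find out how long the query takes to finish executing.

That said, I also started gradually adding data to the tables.

The results are below.

Testing the database after adding data to it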

Query result times

Table talks
---------------
Table Rows: 50 rows
Time: 0.0173 seconds

Table Rows: 644 rows
Time: 0.0535 seconds

Table Rows: 1250 rows
Time: 0.0856 seconds


Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows

Time: 2.656 seconds

Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
talks_comments = 3650 rows

Time: 10.156 seconds

After replacing LEFT JOIN with STRAIGHT_JOIN

Time: 6.675 seconds

It seems this puts a very heavy load on the DB..... now I'm facing another dilemma: how to improve its performance.

Edit: using @leonardo_assumpcao's suggestion

After rebuilding the DB using @leonardo_assumpcao's suggestion
to index a few fields..........


Adding data to other tables
--------------------------
Talks       = 6000  Rows
talks_follow    = 10000 Rows
talks_fan   = 10000 Rows
talks_comments  = 10000 Rows

Time: 17.940 seconds

Is this normal for a database with a large amount of data......?


3 Answers


I can say this is (at least) one of the coolest select statements I improved today.

SELECT STRAIGHT_JOIN
  t.* ,
  COUNT( DISTINCT b.comment_id ) AS comments, 
  COUNT( DISTINCT bt.follow_id ) AS followers, 
  COUNT( DISTINCT c.fan_id )     AS fans

FROM
  (
    SELECT * FROM talks
    WHERE privacy = 'public'
    ORDER BY created_date DESC
    LIMIT 0, 30
  ) AS t

LEFT JOIN talks_follow   bt ON (bt.talk_id = t.talk_id)

LEFT JOIN talks_comments b  ON (b.talk_id = t.talk_id)

LEFT JOIN talks_fan      c  ON (c.talk_id = t.talk_id)

GROUP BY t.talk_id ;

But it seems to me that your problem lies in your tables; a first step towards efficient queries is to index every field involved in your joins.

I've made some modifications to the tables you showed above; you can see the code here (updated).
Quite interesting, isn't it? Since we're here, here is your ER model as well:

Tables

First try it on a MySQL test database. Hopefully it will solve your performance troubles.

(Forgive my English, it's my second language)
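
The indexing advice can be sketched on a throwaway database. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module instead of MySQL, a cut-down talks_comments table, and a made-up index name (idx_talks_comments_talk_id). It compares the query plan for a talk_id lookup before and after the index is created.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL test database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talks_comments (
        comment_id INTEGER PRIMARY KEY,
        talk_id    INTEGER NOT NULL,
        comment    TEXT
    );
""")


def plan_for_comment_lookup(connection):
    """Return the query-plan text for counting one talk's comments."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT COUNT(*) FROM talks_comments WHERE talk_id = ?",
        (1,),
    ).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


plan_before = plan_for_comment_lookup(conn)

# Hypothetical index name; the indexed column is the one used in the JOIN condition.
conn.execute(
    "CREATE INDEX idx_talks_comments_talk_id ON talks_comments (talk_id)"
)

plan_after = plan_for_comment_lookup(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the same CREATE INDEX statement form applies, once for each of talks_comments, talks_follow, and talks_fan, and EXPLAIN on the full query is the tool for checking that the indexes are picked up.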

answered 2013-04-13T19:25:42.577

You can force this into a single query, like this:

SELECT COUNT(*) num, 'talks' item         FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item     FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item  FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments

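This will give you a four-row result set, one row per table. Each row is the count for that particular table.

If you must have it all in one row, you can do a pivot like this.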

SELECT 
  SUM( CASE item WHEN 'talks'         THEN num ELSE 0 END ) AS 'talks', 
  SUM( CASE item WHEN 'talks_fan'     THEN num ELSE 0 END ) AS 'talks_fan', 
  SUM( CASE item WHEN 'talks_follow'  THEN num ELSE 0 END ) AS 'talks_follow', 
  SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM 
(   SELECT COUNT(*) num, 'talks' item         FROM talks
    UNION
    SELECT COUNT(*) num, 'talks_fan' item     FROM talks_fan
    UNION
    SELECT COUNT(*) num, 'talks_follow' item  FROM talks_follow
    UNION
    SELECT COUNT(*) num, 'talks_comment' item FROM talks_comments
) counts

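(This does not take your WHERE g.privacy = clause into account, because I don't understand it. But you can add a WHERE clause to one of the four queries in the UNION to handle it.)

Note that this really is four queries on four separate tables, forced into one query.

And, by the way, there is no difference in value between COUNT(*) and COUNT(id) when id is the table's primary key. COUNT(id) does not count rows where id is NULL, but if id is the primary key then it is NOT NULL. COUNT(*) is faster, though, so use it.

Edit: if you need the number of fan, follow, and comment rows for each distinct talk, do this. It's the same idea of a union and a pivot, but with an extra parameter.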
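
A quick way to see the COUNT(*) versus COUNT(column) point is a three-row table with one NULL. This is a minimal sketch under assumptions: Python's sqlite3 module stands in for MySQL, and the demo table and its nickname column are invented for the illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; "demo" and "nickname" are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo (id INTEGER PRIMARY KEY, nickname TEXT);
    INSERT INTO demo (id, nickname) VALUES (1, 'ann'), (2, NULL), (3, 'cy');
""")

# COUNT(*) counts rows; COUNT(col) counts only rows where col is not NULL.
count_star, count_id, count_nickname = conn.execute(
    "SELECT COUNT(*), COUNT(id), COUNT(nickname) FROM demo"
).fetchone()

print(count_star, count_id, count_nickname)  # prints: 3 3 2
```

The primary key can never be NULL, so COUNT(id) matches COUNT(*); only the nullable column gives a smaller number.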

SELECT 
      talk_id, 
      SUM( CASE item WHEN 'talks_fan'     THEN num ELSE 0 END ) AS 'talks_fan', 
      SUM( CASE item WHEN 'talks_follow'  THEN num ELSE 0 END ) AS 'talks_follow', 
      SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM 
(   
          SELECT talk_id, COUNT(*) num, 'talks_fan' item     
            FROM talks_fan
        GROUP BY talk_id
    UNION
         SELECT talk_id, COUNT(*) num, 'talks_follow' item  
           FROM talks_follow
       GROUP BY talk_id
    UNION
         SELECT talk_id, COUNT(*) num, 'talks_comment' item 
           FROM talks_comments
       GROUP BY talk_id
) counts
GROUP BY talk_id
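
To check what the per-talk union and pivot returns, it can be run against a few rows. This is a sketch under assumptions: Python's sqlite3 module replaces MySQL, the tables are cut down to their key columns, the sample rows are invented, and UNION ALL is swapped in for UNION (the branches can never produce identical rows because item differs, so the result is the same).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talks_fan      (fan_id     INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);
    CREATE TABLE talks_follow   (follow_id  INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);
    CREATE TABLE talks_comments (comment_id INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);

    INSERT INTO talks_fan      (talk_id) VALUES (1), (2), (2);
    INSERT INTO talks_follow   (talk_id) VALUES (1), (1), (1);
    INSERT INTO talks_comments (talk_id) VALUES (1), (1), (2);
""")

# Each branch counts one child table per talk; the outer query pivots the rows into columns.
PIVOT_SQL = """
SELECT talk_id,
       SUM(CASE item WHEN 'talks_fan'      THEN num ELSE 0 END) AS fans,
       SUM(CASE item WHEN 'talks_follow'   THEN num ELSE 0 END) AS followers,
       SUM(CASE item WHEN 'talks_comments' THEN num ELSE 0 END) AS comments
FROM (
    SELECT talk_id, COUNT(*) AS num, 'talks_fan' AS item
      FROM talks_fan GROUP BY talk_id
    UNION ALL
    SELECT talk_id, COUNT(*) AS num, 'talks_follow' AS item
      FROM talks_follow GROUP BY talk_id
    UNION ALL
    SELECT talk_id, COUNT(*) AS num, 'talks_comments' AS item
      FROM talks_comments GROUP BY talk_id
) AS counts
GROUP BY talk_id
ORDER BY talk_id
"""

per_talk_counts = conn.execute(PIVOT_SQL).fetchall()
for talk_id, fans, followers, comments in per_talk_counts:
    print(talk_id, fans, followers, comments)
```

Because each table is counted on its own before the rows are combined, the counts are never multiplied by each other. One thing to keep in mind: a talk with no fans, followers, or comments at all does not appear in this result, since the rows come only from the three child tables.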

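After doing this for (too) many years, I've found that the best way to describe the query you need is to say to yourself "I need a result set with one row for each xxx, with columns for yyy, zzz, and qqq."

answered 2013-04-12T01:38:15.420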

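The reason the counts are the same is that rows are counted after the joins have combined the tables. By joining to multiple tables, you're creating a Cartesian product.

Basically, you're counting not only how many comments there are per talk, but how many comments * followers per talk. Then you count followers as how many followers * comments per talk. So the counts are the same, and they're both too high.

Here's a simpler way to write the query so that each distinct comment, follower, etc. is counted only once: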

SELECT t.*, 
  COUNT(DISTINCT b.comment_id) AS comments, 
  COUNT(DISTINCT bt.follow_id) AS followers 
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
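
The multiplication and the DISTINCT fix can be reproduced with one talk that has 2 comments and 3 followers. This is a small sketch under assumptions: Python's sqlite3 module stands in for MySQL, the tables are reduced to their key columns, and the rows are invented.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; one talk, 2 comments, 3 followers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talks          (talk_id    INTEGER PRIMARY KEY);
    CREATE TABLE talks_comments (comment_id INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);
    CREATE TABLE talks_follow   (follow_id  INTEGER PRIMARY KEY, talk_id INTEGER NOT NULL);

    INSERT INTO talks          (talk_id) VALUES (1);
    INSERT INTO talks_comments (talk_id) VALUES (1), (1);
    INSERT INTO talks_follow   (talk_id) VALUES (1), (1), (1);
""")

# Counting joined rows: every comment is paired with every follower (2 * 3 = 6 rows).
naive_counts = conn.execute("""
    SELECT COUNT(b.talk_id) AS comments, COUNT(bt.talk_id) AS followers
    FROM talks t
    LEFT JOIN talks_follow bt  ON bt.talk_id = t.talk_id
    LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
    GROUP BY t.talk_id
""").fetchone()

# Counting distinct primary keys collapses the duplicates created by the join.
distinct_counts = conn.execute("""
    SELECT COUNT(DISTINCT b.comment_id) AS comments,
           COUNT(DISTINCT bt.follow_id) AS followers
    FROM talks t
    LEFT JOIN talks_follow bt  ON bt.talk_id = t.talk_id
    LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
    GROUP BY t.talk_id
""").fetchone()

print("naive:   ", naive_counts)     # prints: naive:    (6, 6)
print("distinct:", distinct_counts)  # prints: distinct: (2, 3)
```

The join still builds all 6 intermediate rows in the second query; DISTINCT only corrects the numbers, it does not reduce the work.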

Re your comment: I wouldn't fetch all the followers in the same query. You could do it this way:

SELECT t.*, 
  COUNT(DISTINCT b.comment_id) AS comments, 
  COUNT(DISTINCT bt.follow_id) AS followers, 
  GROUP_CONCAT(DISTINCT bt.follower_name) AS list_of_followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;

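But what you get back is a string with the follower names separated by commas. Now you have to write application code to split the string on commas, you have to worry about whether some follower names actually contain commas already, and so on.

I'd do a second query to fetch the followers for a given talk. You probably only want to display the followers for one specific talk anyway.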

SELECT follower_name
FROM talks_follow
WHERE talk_id = ?
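
Here is a rough sketch of that second query next to the comma problem. It rests on assumptions: Python's sqlite3 module stands in for MySQL, follower_name is the same illustrative column used in the queries above (it is not in the posted schema), and followers_for_talk is a hypothetical helper name.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; follower_name is an illustrative column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talks_follow (
        follow_id     INTEGER PRIMARY KEY,
        talk_id       INTEGER NOT NULL,
        follower_name TEXT NOT NULL
    );
    INSERT INTO talks_follow (talk_id, follower_name)
    VALUES (1, 'Smith, John'), (1, 'ann'), (2, 'bob');
""")


def followers_for_talk(connection, talk_id):
    """Hypothetical helper: fetch one talk's followers with a bound parameter."""
    rows = connection.execute(
        "SELECT follower_name FROM talks_follow WHERE talk_id = ? ORDER BY follow_id",
        (talk_id,),
    ).fetchall()
    return [name for (name,) in rows]


# One row per follower, so a comma inside a name is harmless.
followers = followers_for_talk(conn, 1)

# The concatenated form loses that boundary: splitting on commas yields 3 parts for 2 names.
concatenated = conn.execute(
    "SELECT GROUP_CONCAT(follower_name) FROM talks_follow WHERE talk_id = ?",
    (1,),
).fetchone()[0]
parts = concatenated.split(",")

print(followers)
print(len(parts))
```

The separate query costs one extra round trip, but it returns clean rows and only runs for the talk actually being displayed.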
answered 2013-04-12T02:05:58.697