sql - Rewriting a MySQL query

Question

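I'll try to explain this better than I did in my other question. This is the query I think should work but, of course, MySQL doesn't support this particular subselect: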

select *
  from articles a
  where a.article_id in
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 3) /* LIMIT isn't allowed inside an IN subquery */

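What I'm trying to achieve is this: the articles table holds several articles for each of several categories. I need to get at most three articles per category, for any number of categories.

Here is the data: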

CREATE TABLE articles (
  article_id int(10) unsigned NOT NULL AUTO_INCREMENT,
  category_id int(10) unsigned NOT NULL,
  title varchar(100) NOT NULL,
  is_sticky boolean NOT NULL DEFAULT 0,
  published_at datetime NOT NULL,
  PRIMARY KEY (article_id)
);

INSERT INTO articles VALUES
(1, 1, 'foo', 0, '2009-02-06'),
(2, 1, 'bar', 0, '2009-02-07'),
(3, 1, 'baz', 0, '2009-02-08'),
(4, 1, 'qox', 1, '2009-02-09'),

(5, 2, 'foo', 0, '2009-02-06'),
(6, 2, 'bar', 0, '2009-02-07'),
(7, 2, 'baz', 0, '2009-02-08'),
(8, 2, 'qox', 1, '2009-02-09');

What I want to retrieve is the following:

4, 1, qox, 1, 2009-02-09
1, 1, foo, 0, 2009-02-06
2, 1, bar, 0, 2009-02-07
8, 2, qox, 1, 2009-02-09
5, 2, foo, 0, 2009-02-06
6, 2, bar, 0, 2009-02-07

Note how "qox" jumps to first place in its category because it is sticky.

Can you think of a way to avoid the LIMIT inside the subquery?

Thanks

score 1 · Accepted Answer

This is a simplification of your solution:

select *
  from articles a
  where a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 1)
    or a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 1, 1)
    or a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 2, 1)

score 1 · Accepted Answer

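Have a look at this code snippet called Within-group quotas (Top N per group).

It proposes two solutions depending on the size of your set: one using a count, and one using a temporary table for bigger tables.

So basically, if you have a big table, then until MySQL implements LIMIT in subqueries or something similar, you will have to aggregate all your categories manually (well, or with a dynamic query in a loop) using one of the solutions suggested there.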
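For the smaller-table case, the count-based idea could look roughly like the sketch below. This is my own reading of the approach, not code taken from the snippet: a row is kept when fewer than three rows in the same category rank ahead of it. The tie-break on article_id is an assumption I added so that two articles with the same is_sticky and published_at are not both counted as the same rank.

```sql
-- Sketch: keep each article that has fewer than 3 articles
-- ranking ahead of it within its own category.
SELECT a.*
FROM articles a
WHERE (
    SELECT COUNT(*)
    FROM articles f
    WHERE f.category_id = a.category_id
      AND (
            -- sticky articles rank ahead of non-sticky ones
            f.is_sticky > a.is_sticky
            -- same stickiness: earlier publication ranks ahead
         OR (f.is_sticky = a.is_sticky AND f.published_at < a.published_at)
            -- full tie: lower article_id ranks ahead (assumed tie-break)
         OR (f.is_sticky = a.is_sticky AND f.published_at = a.published_at
             AND f.article_id < a.article_id)
      )
) < 3
ORDER BY a.category_id, a.is_sticky DESC, a.published_at, a.article_id;
```

The correlated COUNT runs once per row of articles, so this is likely only reasonable for small sets, which is presumably why a temporary table is suggested for larger ones.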

// Solution using a temporary table and a stored procedure:

Run this once:

DELIMITER //
CREATE PROCEDURE top_articles()
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE catid INT;
    DECLARE cur1 CURSOR FOR SELECT DISTINCT(category_id) FROM articles;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    OPEN cur1;
    # This temporary table will hold all top N article_id for each category
    CREATE TEMPORARY TABLE top_articles (
        article_id int(10) unsigned NOT NULL
    );
    # Loop through each category
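    REPEAT
        FETCH cur1 INTO catid;
        # Skip the insert once the cursor is exhausted, otherwise the
        # last category would be inserted a second time
        IF NOT done THEN
            INSERT INTO top_articles
            SELECT article_id FROM articles
            WHERE category_id = catid
            ORDER BY is_sticky DESC, published_at
            LIMIT 3;
        END IF;
    UNTIL done END REPEAT;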
    # Get all fields in correct order based on our temporary table
    SELECT * FROM articles WHERE article_id 
    IN (SELECT article_id FROM top_articles)
    ORDER BY category_id, is_sticky DESC, published_at;
    # Remove our temporary table
    DROP TEMPORARY TABLE top_articles;
END;
//
DELIMITER ;

Then, try it:

CALL top_articles();

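You should see the results you were expecting. It should work for any number of articles per category, and it adapts easily to any number of categories. This is what I get: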

+------------+-------------+-------+-----------+---------------------+
| article_id | category_id | title | is_sticky | published_at        |
+------------+-------------+-------+-----------+---------------------+
|          5 |           1 | qox   |         1 | 2009-02-09 00:00:00 | 
|          1 |           1 | foo   |         0 | 2009-02-06 00:00:00 | 
|          2 |           1 | foo   |         0 | 2009-02-06 00:00:00 | 
|          9 |           2 | qox   |         1 | 2009-02-09 00:00:00 | 
|          6 |           2 | foo   |         0 | 2009-02-06 00:00:00 | 
|          7 |           2 | bar   |         0 | 2009-02-07 00:00:00 | 
+------------+-------------+-------+-----------+---------------------+

I don't know how it will do performance-wise, though. It could probably be optimized and cleaned up a bit.
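If you happen to be on MySQL 8.0 or later, window functions offer a way to do this without a cursor or a temporary table. The following is only a sketch under that version assumption; the derived-table alias ranked, the column alias rn, and the article_id tie-break are names and choices I made up for illustration.

```sql
-- Sketch (assumes MySQL 8.0+): number the rows inside each category,
-- sticky articles first, then oldest first, and keep the first three.
SELECT article_id, category_id, title, is_sticky, published_at
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (
               PARTITION BY category_id
               ORDER BY is_sticky DESC, published_at, article_id
           ) AS rn
    FROM articles a
) ranked
WHERE rn <= 3
ORDER BY category_id, rn;
```

Changing the 3 in the outer WHERE clause gives a different quota per category. On older MySQL versions this will not parse, so the stored procedure above remains the fallback.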

score 0 · Accepted Answer

I've found a (horrible, horrible) workaround that I probably shouldn't even post, but...

select *
  from articles a
  where a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 1)
union
select *
  from articles a
  where a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 1, 1)
union
select *
  from articles a
  where a.article_id =
      (select f.article_id
        from articles f
        where f.category_id = a.category_id
        order by f.is_sticky desc, f.published_at
        limit 2, 1)
order by category_id

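Since I only need three articles per category, I can repeat the query three times (rather than once per category): once for the first article of every category, once for the second, and once for the third. I then union the three results and order them by category.

It seems LIMIT isn't supported in combination with IN, but it works fine when retrieving one record at a time.

If you have a better approach, I'm still interested in your solution.

Thanks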
