5

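My MySQL database becomes very CPU-hungry when it runs one particularly slow query. When I run EXPLAIN on it, MySQL reports "Using where; Using temporary; Using filesort". Please help me decipher and solve this puzzle.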

Table structure:

CREATE TABLE `topsources` (
  `USER_ID` varchar(255) NOT NULL,
  `UPDATED_TIME` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `URL_ID` int(11) NOT NULL,
  `SOURCE_SLUG` varchar(100) NOT NULL,
  `FEED_PAGE_URL` varchar(255) NOT NULL,
  `CATEGORY_SLUG` varchar(100) NOT NULL,
  `REFERRER` varchar(2048) DEFAULT NULL,
  PRIMARY KEY (`USER_ID`,`URL_ID`),
  KEY `USER_ID` (`USER_ID`),
  KEY `FEED_PAGE_URL` (`FEED_PAGE_URL`),
  KEY `SOURCE_SLUG` (`SOURCE_SLUG`),
  KEY `CATEGORY_SLUG` (`CATEGORY_SLUG`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

The table has 370K rows, sometimes more. The query below takes more than 10 seconds.

SELECT topsources.SOURCE_SLUG, COUNT(topsources.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources
WHERE CATEGORY_SLUG = '/newssource'
GROUP BY topsources.SOURCE_SLUG
HAVING MAX(CASE WHEN topsources.USER_ID = 'xxxx' THEN 1 ELSE 0 END) = 0
ORDER BY VIEW_COUNT DESC;

Here is the EXPLAIN EXTENDED output:

+----+-------------+------------+------+---------------+---------------+---------+-------+--------+----------+----------------------------------------------+
| id | select_type | table      | type | possible_keys | key           | key_len | ref   | rows   | filtered | Extra                                        |
+----+-------------+------------+------+---------------+---------------+---------+-------+--------+----------+----------------------------------------------+
|  1 | SIMPLE      | topsources | ref  | CATEGORY_SLUG | CATEGORY_SLUG | 302     | const | 160790 |   100.00 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+---------------+---------------+---------+-------+--------+----------+----------------------------------------------+

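Is there any way to improve this query? Also, are there any MySQL settings that would help reduce the CPU load? I have spare memory on the server that I can allocate.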

4 Answers

1

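What would most likely help the query is an index on CATEGORY_SLUG, especially if that column takes on many values (that is, if the condition is highly selective). Otherwise the query has to read the entire table to get its results, although even then 10 seconds seems like a long time.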
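One possible refinement of the index idea, offered as an untested sketch rather than a confirmed fix: the table in the question already has a single-column index on CATEGORY_SLUG, and the EXPLAIN output shows it being used, so a composite index that also contains SOURCE_SLUG may let MySQL read the matching rows already ordered by the GROUP BY column. The index name `idx_category_source` is made up for illustration. USER_ID is deliberately left out, on the assumption that adding a utf8 varchar(255) column would push the key past MyISAM's 1000-byte limit.

```sql
-- Hypothetical index name. With utf8, each varchar(100) key part takes
-- 302 bytes (the key_len shown in the EXPLAIN output), so this key is
-- 604 bytes and stays under MyISAM's 1000-byte key limit.
ALTER TABLE topsources
  ADD INDEX idx_category_source (CATEGORY_SLUG, SOURCE_SLUG);

-- Re-check the plan afterwards and compare the key and Extra columns.
EXPLAIN
SELECT topsources.SOURCE_SLUG, COUNT(topsources.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources
WHERE CATEGORY_SLUG = '/newssource'
GROUP BY topsources.SOURCE_SLUG
HAVING MAX(CASE WHEN topsources.USER_ID = 'xxxx' THEN 1 ELSE 0 END) = 0
ORDER BY VIEW_COUNT DESC;
```

Because the final ORDER BY is on an aggregate (VIEW_COUNT), a sort step is still expected, so "Using filesort" may well remain in the plan even if the grouping itself gets cheaper.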

I don't think the HAVING clause is affecting query processing.

Does the query take just as long if you run it twice in a row?
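A minimal way to run that check, assuming a MySQL 5.x server where the query cache exists: adding SQL_NO_CACHE keeps the second run from being answered straight out of the query cache, so any speed-up on the second run would point to index and data blocks having been cached rather than to a cached result.

```sql
-- Run this statement twice and compare the reported times.
-- SQL_NO_CACHE tells MySQL 5.x not to read from or write to the query cache.
SELECT SQL_NO_CACHE topsources.SOURCE_SLUG, COUNT(topsources.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources
WHERE CATEGORY_SLUG = '/newssource'
GROUP BY topsources.SOURCE_SLUG
HAVING MAX(CASE WHEN topsources.USER_ID = 'xxxx' THEN 1 ELSE 0 END) = 0
ORDER BY VIEW_COUNT DESC;
```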

answered 2012-07-05T18:11:16.560
0

If a lot of rows match your CATEGORY_SLUG criterion, it may be hard to make this fast, but is this any quicker?

SELECT ts.SOURCE_SLUG, COUNT(ts.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources ts
WHERE ts.CATEGORY_SLUG = '/newssource'
  -- Skip any source this user already has a row for in the category
  AND NOT EXISTS(SELECT 1 FROM topsources ts2
                 WHERE ts2.CATEGORY_SLUG = '/newssource'
                   AND ts2.SOURCE_SLUG = ts.SOURCE_SLUG
                   AND ts2.USER_ID = 'xxxx')
GROUP BY ts.SOURCE_SLUG
ORDER BY VIEW_COUNT DESC;
answered 2012-07-05T18:27:24.260
0

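It's always hard to optimize something when you can't run queries against the data yourself, but if I were doing it, this would be my first attempt: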

SELECT t.SOURCE_SLUG, COUNT(t.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources t
LEFT JOIN (
    -- Sources this user already has a row for in the category
    SELECT t2.SOURCE_SLUG
    FROM topsources t2
    WHERE t2.CATEGORY_SLUG = '/newssource'
    AND t2.USER_ID = 'xxxx'
    GROUP BY t2.SOURCE_SLUG
) x USING (SOURCE_SLUG)
WHERE t.CATEGORY_SLUG = '/newssource'
AND x.SOURCE_SLUG IS NULL
GROUP BY t.SOURCE_SLUG
ORDER BY VIEW_COUNT DESC;
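Since the rewrite cannot be run against the real data here, one way to sanity-check it is a small scratch table. The sketch below is a toy under stated assumptions: the table name `topsources_demo`, the user ids and the slugs are all invented for illustration, and only the columns the queries touch are kept. A regular table is used rather than a TEMPORARY one because MySQL does not allow a temporary table to be referenced more than once in the same query.

```sql
-- Hypothetical scratch table holding only the columns the queries use.
DROP TABLE IF EXISTS topsources_demo;
CREATE TABLE topsources_demo (
  USER_ID       varchar(255) NOT NULL,
  URL_ID        int(11)      NOT NULL,
  SOURCE_SLUG   varchar(100) NOT NULL,
  CATEGORY_SLUG varchar(100) NOT NULL,
  PRIMARY KEY (USER_ID, URL_ID)
);

-- Invented rows: user 'xxxx' has viewed source-a in '/newssource'
-- and source-c only in another category.
INSERT INTO topsources_demo VALUES
  ('xxxx', 1, 'source-a', '/newssource'),
  ('yyyy', 1, 'source-a', '/newssource'),
  ('yyyy', 2, 'source-b', '/newssource'),
  ('zzzz', 2, 'source-b', '/newssource'),
  ('xxxx', 3, 'source-c', '/other'),
  ('yyyy', 3, 'source-c', '/newssource');

-- Original form from the question, pointed at the scratch table.
SELECT d.SOURCE_SLUG, COUNT(d.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources_demo d
WHERE d.CATEGORY_SLUG = '/newssource'
GROUP BY d.SOURCE_SLUG
HAVING MAX(CASE WHEN d.USER_ID = 'xxxx' THEN 1 ELSE 0 END) = 0
ORDER BY VIEW_COUNT DESC;

-- LEFT JOIN form: drop every source that appears in the user's own list.
SELECT t.SOURCE_SLUG, COUNT(t.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources_demo t
LEFT JOIN (
    SELECT t2.SOURCE_SLUG
    FROM topsources_demo t2
    WHERE t2.CATEGORY_SLUG = '/newssource'
      AND t2.USER_ID = 'xxxx'
    GROUP BY t2.SOURCE_SLUG
) x USING (SOURCE_SLUG)
WHERE t.CATEGORY_SLUG = '/newssource'
  AND x.SOURCE_SLUG IS NULL
GROUP BY t.SOURCE_SLUG
ORDER BY VIEW_COUNT DESC;

-- On these rows both queries return source-b with 2 and source-c with 1;
-- source-a is excluded because 'xxxx' has a '/newssource' row for it.

DROP TABLE topsources_demo;
```

Matching results on a toy table do not say anything about speed; they only confirm that the two forms agree on which sources are excluded before the plans are compared on the real table.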
answered 2012-07-06T08:23:07.510
0

If I've read your SQL correctly, this change should do the trick:

SELECT topsources.SOURCE_SLUG, COUNT(topsources.SOURCE_SLUG) AS VIEW_COUNT
FROM topsources
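WHERE CATEGORY_SLUG = '/newssource' AND
    topsources.SOURCE_SLUG NOT IN (
        -- Sources this user has already viewed (in any category,
        -- since this subquery does not filter on CATEGORY_SLUG)
        SELECT DISTINCT SOURCE_SLUG
        FROM topsources
        WHERE USER_ID = 'xxxx'
        )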
GROUP BY topsources.SOURCE_SLUG
ORDER BY VIEW_COUNT DESC;
answered 2012-07-05T18:07:59.027