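I have been struggling with the following query (and a few similar ones), and I feel I am either missing something or using the wrong kind of database.

For each of the past 10 years, the query returns the total number of new films and the total number of films that stopped showing (closed), for the UK as a whole compared with one specific town. The same kind of query is also run for many other towns and counties, and for other years.

Other queries do something similar, sometimes with a UNION ALL appended to the end of the query to fetch the record year for opens or closures.

Some queries run on monthly or quarterly data rather than yearly data, and others only compare historical opens/closures for a specific quarter (e.g. Q3) or month (e.g. March).

Here is the query that compares the UK with London for 2012: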

SELECT inc.opening_year as year, inc.number_of_films as opens,
    diss.number_of_films as closures, inc.uk_films as uk_opens,
    diss.uk_films as uk_closures
FROM
(SELECT film_db.opening_year, uk.number_of_films as uk_films,
        COUNT(film_db.id_film_db) as number_of_films
    FROM film_db
    JOIN postcodes ON id_postcodes = opening_postcode_id
    JOIN towns ON id_towns = town_id AND town = 'London'
    JOIN (SELECT opening_year, COUNT(film_db.id_film_db) as number_of_films
            FROM film_db
            WHERE opening_year <= 2012 AND opening_year >= (2012 - 10)
            GROUP BY opening_year
        ) uk ON uk.opening_year = film_db.opening_year
    WHERE film_db.opening_year <= 2012 AND film_db.opening_year >= (2012 - 10)
    GROUP BY film_db.opening_year
    ORDER BY film_db.opening_year DESC
) inc
JOIN
(SELECT film_db.closing_year, uk.number_of_films as uk_films,
        COUNT(film_db.id_film_db) as number_of_films
    FROM film_db
    JOIN postcodes ON id_postcodes = postcode_id
    JOIN towns ON id_towns = town_id AND town = 'London'
    JOIN (SELECT closing_year, COUNT(film_db.id_film_db) as number_of_films
            FROM film_db
            WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
            GROUP BY film_db.closing_year
        ) uk ON uk.closing_year = film_db.closing_year
    WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
    GROUP BY film_db.closing_year
    ORDER BY film_db.closing_year DESC
) diss ON diss.closing_year = inc.opening_year

The SHOW CREATE TABLE output for the tables is as follows:

film_db:

CREATE TABLE `film_db` (
  `id_film_db` int(11) NOT NULL AUTO_INCREMENT,
  `film_name` varchar(255) DEFAULT NULL,
  `category` varchar(100) DEFAULT NULL,
  `status` varchar(50) DEFAULT NULL,
  `opening_date` date DEFAULT NULL,
  `opening_year` int(4) DEFAULT NULL,
  `opening_month` int(2) DEFAULT NULL,
  `opening_quarter` int(1) DEFAULT NULL,
  `closing_date` date DEFAULT NULL,
  `closing_year` int(4) DEFAULT NULL,
  `closing_month` int(2) DEFAULT NULL,
  `closing_quarter` int(1) DEFAULT NULL,
  `datetime` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `postcode_id` int(4) NOT NULL DEFAULT '0',
  `opening_postcode_id` int(4) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id_film_db`),
  KEY `opening_date` (`opening_date`),
  KEY `status` (`status`),
  KEY `postcode_id` (`postcode_id`),
  KEY `type` (`category`),
  KEY `opening_year` (`opening_year`),
  KEY `opening_month` (`opening_month`,`opening_year`) USING BTREE,
  KEY `opening_quarter` (`opening_quarter`,`opening_year`) USING BTREE,
  KEY `closing_year` (`closing_year`),
  KEY `closing_month` (`closing_year`,`closing_month`),
  KEY `closing_quarter` (`closing_year`,`closing_quarter`),
  KEY `closing_date` (`closing_date`),
  KEY `opening_closing_date` (`opening_date`,`closing_date`),
  KEY `opening_postcode` (`opening_postcode_id`),
  FULLTEXT KEY `film_name` (`film_name`)
) ENGINE=MyISAM AUTO_INCREMENT=10649173 DEFAULT CHARSET=utf8

postcodes:

CREATE TABLE `postcodes` (
  `id_postcodes` int(4) NOT NULL AUTO_INCREMENT,
  `postcode` varchar(9) NOT NULL,
  `town_id` int(4) NOT NULL,
  `lat` float NOT NULL,
  `lng` float NOT NULL,
  PRIMARY KEY (`id_postcodes`),
  UNIQUE KEY `postcode` (`postcode`) USING BTREE,
  KEY `town` (`town_id`)
) ENGINE=MyISAM AUTO_INCREMENT=5705 DEFAULT CHARSET=latin1

towns:

CREATE TABLE `towns` (
  `id_towns` int(4) NOT NULL AUTO_INCREMENT,
  `town` varchar(150) NOT NULL,
  `county_id` int(3) NOT NULL,
  PRIMARY KEY (`id_towns`),
  KEY `county` (`county_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1606 DEFAULT CHARSET=latin1

Here is the EXPLAIN EXTENDED output:

1   PRIMARY <derived2>      ALL                                                                                                                     11      100 
1   PRIMARY <derived4>      ALL                                                                                                                     11      100     Using where; Using join buffer
4   DERIVED <derived5>      ALL                                                                                                                     11      100     Using where; Using temporary; Using filesort
4   DERIVED film_db         ref     postcode_id,closing_year,closing_month,closing_quarter  closing_year    5   uk.closing_year                     2       100     Using where
4   DERIVED postcodes       eq_ref  PRIMARY,town                                            PRIMARY         4   film_db.postcode_id                 1       100 
4   DERIVED towns           eq_ref  PRIMARY                                                 PRIMARY         4   postcodes.town_id                   1       100     Using where
5   DERIVED film_db         ALL     closing_year,closing_month,closing_quarter                                                                      9895680 47.66   Using where; Using temporary; Using filesort
2   DERIVED <derived3>      ALL                                                                                                                     11      100     Using where; Using temporary; Using filesort
2   DERIVED film_db         ref     opening_year,opening_postcode                           opening_year    5   uk.opening_year                     3       100     Using where
2   DERIVED postcodes       eq_ref  PRIMARY,town                                            PRIMARY         4   film_db.opening_postcode_id         1       100 
2   DERIVED towns           eq_ref  PRIMARY                                                 PRIMARY         4   postcodes.town_id                   1       100     Using where
3   DERIVED film_db         ALL     opening_year                                                                                                    9895680 54.53   Using where; Using temporary; Using filesort

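As you can see, in the UK-wide subqueries (the rows with id 3 and 5) MySQL decides that filtering film_db will not make any difference to performance, so it uses no key and scans roughly 9.9 million rows.

So:

Can I improve this query so that it makes better use of the indexes?

Can I improve the indexes so that the query runs faster?

Should I use a different kind of database (not MySQL) for this sort of query? What I care about most is counting entries under complex conditions and joins.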


1 Answer


Here is the first thing I would try:

-- UK-wide opens per year, computed once instead of in a subselect
CREATE TABLE opens
SELECT opening_year, COUNT(film_db.id_film_db) AS number_of_films
FROM film_db
WHERE opening_year <= 2012 AND opening_year >= (2012 - 10)
GROUP BY opening_year;

-- UK-wide closures per year
CREATE TABLE closures
SELECT closing_year, COUNT(film_db.id_film_db) AS number_of_films
FROM film_db
WHERE film_db.closing_year <= 2012 AND film_db.closing_year >= (2012 - 10)
GROUP BY film_db.closing_year;

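I would use these two tables in place of the subselects you use now.

You wrote: "Other queries do something similar, sometimes with a UNION ALL appended to the end of the query to fetch the record year for opens or closures. Some queries run on monthly or quarterly data rather than yearly data, and others only compare historical opens/closures for a specific quarter (e.g. Q3) or month (e.g. March)."

I presume you run these selects far more often than the contents of the opens/closures tables change, so you do not have to rebuild these tables every time you run such a query.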
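As a rough sketch of what that substitution could look like, the query from the question can join the `opens` and `closures` tables created above where it previously had the `uk` subselects. The table aliases `f`, `p` and `t` are added here only for readability, and the query has not been run against the real data:

```sql
SELECT inc.opening_year AS year, inc.number_of_films AS opens,
    diss.number_of_films AS closures, inc.uk_films AS uk_opens,
    diss.uk_films AS uk_closures
FROM
(SELECT f.opening_year, uk.number_of_films AS uk_films,
        COUNT(f.id_film_db) AS number_of_films
    FROM film_db f
    JOIN postcodes p ON p.id_postcodes = f.opening_postcode_id
    JOIN towns t ON t.id_towns = p.town_id AND t.town = 'London'
    JOIN opens uk ON uk.opening_year = f.opening_year   -- precomputed UK totals
    WHERE f.opening_year BETWEEN 2012 - 10 AND 2012
    GROUP BY f.opening_year, uk.number_of_films
) inc
JOIN
(SELECT f.closing_year, uk.number_of_films AS uk_films,
        COUNT(f.id_film_db) AS number_of_films
    FROM film_db f
    JOIN postcodes p ON p.id_postcodes = f.postcode_id
    JOIN towns t ON t.id_towns = p.town_id AND t.town = 'London'
    JOIN closures uk ON uk.closing_year = f.closing_year -- precomputed UK totals
    WHERE f.closing_year BETWEEN 2012 - 10 AND 2012
    GROUP BY f.closing_year, uk.number_of_films
) diss ON diss.closing_year = inc.opening_year
ORDER BY year DESC;
```

The two full scans of film_db (ids 3 and 5 in the EXPLAIN output) should disappear from the plan, because the UK-wide totals are now read from tables of about eleven rows each.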
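One way to avoid rebuilding is to keep a permanent summary table covering every year and refresh it only when film_db changes, for example from a scheduled job. This is a sketch under assumptions: the table name `opens_by_year` is hypothetical, and how often a refresh is needed depends on how film_db is loaded.

```sql
-- Hypothetical summary table: one row per opening year, all years included
CREATE TABLE opens_by_year (
  opening_year INT NOT NULL,
  number_of_films INT NOT NULL,
  PRIMARY KEY (opening_year)
) ENGINE=MyISAM;

-- Refresh step: REPLACE overwrites the existing row for each year
REPLACE INTO opens_by_year (opening_year, number_of_films)
SELECT opening_year, COUNT(id_film_db)
FROM film_db
WHERE opening_year IS NOT NULL
GROUP BY opening_year;

-- The reporting query then filters the small table instead of film_db
SELECT opening_year, number_of_films
FROM opens_by_year
WHERE opening_year BETWEEN 2012 - 10 AND 2012;
```

The same pattern would apply to closures, and to the monthly and quarterly variants by adding the month or quarter column to the primary key.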


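You asked: "Can I improve this query so that it makes better use of the indexes? Can I improve the indexes so that the query runs faster? Should I use a different kind of database (not MySQL) for this sort of query? What I care about most is counting entries under complex conditions and joins."

There are of course many other possible improvements, and there should certainly be a way to make MySQL use an index. Note that MySQL generally uses only one index per table reference (its index-merge optimization applies only in limited cases), so here the index on opening_postcode_id and the index on opening_year will not be combined. I do not know why neither is used for the UK-wide counts, but I am confident that composite indexes like these two would improve this query: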

KEY `opening_year_postcode` (`opening_year`, `opening_postcode_id`)
KEY `closing_year_postcode` (`closing_year`, `postcode_id`)

See this answer: https://stackoverflow.com/a/6295744/176569
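A minimal sketch of adding the two composite keys to the existing table and then checking whether the optimizer picks them up. The EXPLAIN query below is a simplified version of the London opens count, written for illustration only. Be aware that on a MyISAM table of this size the ALTER TABLE is likely to rebuild the table and block writes while it runs.

```sql
-- Add both composite indexes in one statement so the table is rebuilt once
ALTER TABLE film_db
  ADD KEY opening_year_postcode (opening_year, opening_postcode_id),
  ADD KEY closing_year_postcode (closing_year, postcode_id);

-- Check the plan: look at the key column of the film_db row.
-- COUNT(*) equals COUNT(id_film_db) here because id_film_db is NOT NULL.
EXPLAIN
SELECT f.opening_year, COUNT(*) AS number_of_films
FROM towns t
JOIN postcodes p ON p.town_id = t.id_towns
JOIN film_db f ON f.opening_postcode_id = p.id_postcodes
WHERE t.town = 'London'
  AND f.opening_year BETWEEN 2012 - 10 AND 2012
GROUP BY f.opening_year;
```

If the plan still shows the single-column keys, compare timings before and after rather than assuming the new index helps.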


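Over the years I have learned that this kind of performance tuning is an incremental process. You will have to try several tricks, measure the gain from each, and in the end you will probably apply only one or two of them.

At this point I would not consider abandoning MySQL for another database vendor. The cause of your performance problem is probably not MySQL.

answered 2012-08-24T10:35:21.690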