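I have two tables that are identical except that one has a TIMESTAMP column and the other has a DATETIME column. The indexes are the same. The values are the same.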

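However, when I run SELECT station, MAX(timestamp) AS max_timestamp FROM stations GROUP BY station; it executes very quickly against the table with the TIMESTAMP column, whereas against the table with the DATETIME column I have yet to see the query finish. In both cases the timestamp column is indexed; only the type differs.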

Where should I start looking? Or is DATETIME simply ill-suited to searching and indexing?

Here is what EXPLAIN gives:

+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+
| id | select_type | table    | type  | possible_keys | key   | key_len | ref  | rows | Extra                    |
+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+
|  1 | SIMPLE      | stations | range | NULL          | stamp | 33      | NULL | 1511 | Using index for group-by |
+----+-------------+----------+-------+---------------+-------+---------+------+------+--------------------------+

+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+
| id | select_type | table     | type  | possible_keys | key     | key_len | ref  | rows    | Extra |
+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+
|  1 | SIMPLE      | stations2 | index | NULL          | station | 2       | NULL | 3025467 |       |
+----+-------------+-----------+-------+---------------+---------+---------+------+---------+-------+

SHOW CREATE TABLE output:

CREATE TABLE `stations` (
  `station` varchar(10) COLLATE utf8_bin DEFAULT NULL,
  `available` smallint(6) DEFAULT NULL,
  `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `stamp` (`station`,`timestamp`),
  KEY `time` (`timestamp`),
  KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin

CREATE TABLE `stations2` (
  `station` smallint(5) unsigned NOT NULL,
  `available` smallint(5) unsigned DEFAULT NULL,
  `timestamp` datetime DEFAULT NULL,
  KEY `station` (`station`),
  KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin

1 Answer

You can see from the EXPLAIN output that no key is being used for selection (possible_keys is NULL). You have no WHERE clause, so that makes sense.

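MySQL can use an index to determine a MAX, and it can use an index to optimize a GROUP BY. To optimize the two together, however, the column in the GROUP BY clause and the column inside MAX() must both be part of one composite index, with the GROUP BY column first. In the first table you have that composite index as the unique key named "stamp", and the EXPLAIN result shows MySQL using it.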

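The second table has no such composite index, so MySQL has to do more work: it scans every row, grouping the results itself and tracking the MAX value for each station as it goes. If you add the same composite index to the second table, you should see similar performance from the two.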
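
A minimal sketch of that change, assuming an index name of station_stamp (hypothetical; any name works) and a plain rather than unique key, since nothing in the question says whether (station, timestamp) pairs are unique in stations2:

```sql
-- Hypothetical index name. A non-unique key is assumed here because
-- uniqueness of (station, timestamp) in stations2 is not known.
ALTER TABLE stations2
  ADD KEY station_stamp (`station`, `timestamp`);

-- Re-check the plan for the same query against the second table.
EXPLAIN
SELECT station, MAX(`timestamp`) AS max_timestamp
FROM stations2
GROUP BY station;
```

If the optimizer picks up the new index, the Extra column should read "Using index for group-by", as it does for stations. The existing single-column key on station then becomes a left prefix of the new index and is likely redundant.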

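That said, TIMESTAMP will still slightly outperform DATETIME, because a TIMESTAMP is handled as a single 4-byte integer value, which is faster to process than the special 8-byte DATETIME value (the DATETIME storage size in MySQL versions before 5.6.4). The larger the dataset, the bigger the difference you will see.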

answered 2012-05-25T16:58:07.547