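I'm still new to SQL and I'm trying to improve my query performance. After looking around, I concluded that using JOINs instead of so many WHERE ... IN clauses would help, but I'm not sure how to convert my statement. This is my current statement: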

SELECT stop_id, stop_name FROM stops WHERE stop_id IN (
       SELECT DISTINCT stop_id FROM stop_times WHERE trip_id IN (
              SELECT trip_id from trips WHERE route_id = <routeid> ));

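It takes 5-25 seconds to return results, which is unacceptable. I'd like it to be under 1 second. In case anyone is wondering, the data comes from a GTFS feed. The stops and trips tables each have about 10,000 rows, while the stop_times table has about 900,000 rows. I have created an index on every column I use. Below is the EXPLAIN output, followed by the statements used to create each table.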

Thanks for your help, and let me know if you need more information!

+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
| id | select_type        | table      | type            | possible_keys    | key     | key_len | ref  | rows | Extra       |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | stops      | ALL             | NULL             | NULL    | NULL    | NULL | 6481 | Using where |
|  2 | DEPENDENT SUBQUERY | stop_times | index_subquery  | stop_id          | stop_id | 63      | func |   63 | Using where |
|  3 | DEPENDENT SUBQUERY | trips      | unique_subquery | PRIMARY,route_id | PRIMARY | 62      | func |    1 | Using where |
+----+--------------------+------------+-----------------+------------------+---------+---------+------+------+-------------+


| stops | CREATE TABLE `stops` (
  `stop_id` varchar(20) NOT NULL,
  `stop_code` varchar(50) DEFAULT NULL,
  `stop_name` varchar(255) DEFAULT NULL,
  `stop_desc` varchar(255) DEFAULT NULL,
  `stop_lat` decimal(8,6) DEFAULT NULL,
  `stop_lon` decimal(8,6) DEFAULT NULL,
  `zone_id` int(11) DEFAULT NULL,
  `stop_url` varchar(255) DEFAULT NULL,
  `location_type` int(2) DEFAULT NULL,
  `parent_station` int(11) DEFAULT NULL,
  `wheelchair_boarding` int(2) DEFAULT NULL,
  PRIMARY KEY (`stop_id`),
  KEY `zone_id` (`zone_id`),
  KEY `stop_lat` (`stop_lat`),
  KEY `stop_lon` (`stop_lon`),
  KEY `location_type` (`location_type`),
  KEY `parent_station` (`parent_station`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |


| stop_times | CREATE TABLE `stop_times` (
  `trip_id` varchar(20) DEFAULT NULL,
  `arrival_time` varchar(8) DEFAULT NULL,
  `arrival_time_seconds` int(11) DEFAULT NULL,
  `departure_time` varchar(8) DEFAULT NULL,
  `departure_time_seconds` int(11) DEFAULT NULL,
  `stop_id` varchar(20) DEFAULT NULL,
  `stop_sequence` int(11) DEFAULT NULL,
  `stop_headsign` varchar(50) DEFAULT NULL,
  `pickup_type` int(2) DEFAULT NULL,
  `drop_off_type` int(2) DEFAULT NULL,
  `shape_dist_traveled` varchar(50) DEFAULT NULL,
  KEY `trip_id` (`trip_id`),
  KEY `arrival_time_seconds` (`arrival_time_seconds`),
  KEY `departure_time_seconds` (`departure_time_seconds`),
  KEY `stop_id` (`stop_id`),
  KEY `stop_sequence` (`stop_sequence`),
  KEY `pickup_type` (`pickup_type`),
  KEY `drop_off_type` (`drop_off_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

| trips | CREATE TABLE `trips` (
  `route_id` varchar(20) DEFAULT NULL,
  `service_id` varchar(20) DEFAULT NULL,
  `trip_id` varchar(20) NOT NULL,
  `trip_headsign` varchar(255) DEFAULT NULL,
  `trip_short_name` varchar(255) DEFAULT NULL,
  `direction_id` tinyint(1) DEFAULT NULL,
  `block_id` int(11) DEFAULT NULL,
  `shape_id` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`trip_id`),
  KEY `route_id` (`route_id`),
  KEY `service_id` (`service_id`),
  KEY `direction_id` (`direction_id`),
  KEY `block_id` (`block_id`),
  KEY `shape_id` (`shape_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

1 Answer

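You are right that JOINs are usually faster than WHERE ... IN subqueries in MySQL. In your EXPLAIN output both subqueries are marked DEPENDENT SUBQUERY, which means MySQL re-evaluates them for each row of the full scan (type ALL) on stops.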

Try this:

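-- Start from the trips on the route, then follow them to stop_times and stops.
SELECT T3.stop_id, T3.stop_name
FROM trips AS T1
JOIN stop_times AS T2
  ON T1.trip_id = T2.trip_id AND T1.route_id = <routeid>
JOIN stops AS T3
  ON T2.stop_id = T3.stop_id
-- GROUP BY collapses the many stop_times rows per stop into one row each.
GROUP BY T3.stop_id, T3.stop_name;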
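
If the join is still slower than you want, one thing that may help is a composite index on stop_times, so that MySQL can look up rows by trip_id and read stop_id from the same index entry. The sketch below is an assumption, not something tested against your data: the index name trip_stop is made up, and 'ROUTE_ID' stands in for a real route_id value. Running EXPLAIN on the new query shows whether the DEPENDENT SUBQUERY rows are gone and which keys MySQL picks.

```sql
-- Hypothetical composite index (the name trip_stop is an assumption).
-- It covers both stop_times columns that the join touches.
ALTER TABLE stop_times ADD INDEX trip_stop (trip_id, stop_id);

-- Check the plan for the rewritten query.
-- 'ROUTE_ID' is a placeholder; substitute an actual route_id.
EXPLAIN
SELECT T3.stop_id, T3.stop_name
FROM trips AS T1
JOIN stop_times AS T2
  ON T1.trip_id = T2.trip_id AND T1.route_id = 'ROUTE_ID'
JOIN stops AS T3
  ON T2.stop_id = T3.stop_id
GROUP BY T3.stop_id, T3.stop_name;
```

If the optimizer uses the new index, the stop_times row in the EXPLAIN output should list it under key, possibly with "Using index" in the Extra column.
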
answered 2013-07-04T20:10:28.070