
I am working with the following table:

1   0051ML66220600132482    06:00:00        06:00:00        1538    100 0   1
2   0051ML66220600132482    06:00:00        06:00:00        1540    200 0   0
3   0051ML66220600132482    06:00:00        06:00:00        1541    300 0   0
4   0051ML66220600132482    06:01:00        06:01:00        1542    400 0   0
5   0051ML66220600132482    06:01:00        06:01:00        1543    500 0   0
6   0051ML66220600132482    06:02:00        06:02:00        1544    600 0   0
7   0051ML66220600132482    06:03:00        06:03:00        1546    700 0   0

Our table structure is as follows:

> ------------------------------------------------------------------
> --  TABLE stop_times
> ------------------------------------------------------------------
> 
> CREATE TABLE stop_times ( id int(12),
>                           trip_id varchar(100),
>                           arrival_time varchar(8),
>                           arrival_time_seconds int(100),
>                           departure_time varchar(8),
>                           departure_time_seconds int(100),
>                           stop_id varchar(100),
>                           stop_sequence varchar(100),
>                           pickup_type varchar(2),
>                           drop_off_type varchar(2) );

I am trying to get the DISTINCT trip_id values that match both a departure stop_id and an arrival stop_id.

I tried the following SQL, with no luck:

select DISTINCT trip_id from stop_times where stop_id=1538 AND stop_id =1540;

which should produce: 0051ML66220600132482

I also tried the following INNER JOIN SQL:

SELECT 
       t.trip_id,
       start_s.stop_name as departure_stop,
       end_s.stop_name as arrival_stop
FROM
trips t 
        INNER JOIN stop_times start_st ON t.trip_id = start_st.trip_id
        INNER JOIN stops start_s ON start_st.stop_id = start_s.stop_id
        INNER JOIN stop_times end_st ON t.trip_id = end_st.trip_id
        INNER JOIN stops end_s ON end_st.stop_id = end_s.stop_id
WHERE 
   start_s.stop_id = 1538 
  AND end_s.stop_id = 1540;

But it is far too slow: this simple query takes about 8-15 seconds.

EXPLAIN output added:

[image: EXPLAIN output for the query above]

What is the fastest/best way to write this query?


2 Answers


So, in other words, you are looking for a query that identifies all trips passing through a pair of stops: an origin and then a destination.

Try this query:

SELECT destination.trip_id
    FROM stop_times AS origin
    INNER JOIN stop_times AS destination
        ON destination.trip_id = origin.trip_id
        AND destination.stop_id = 1540
    WHERE origin.stop_id = 1538
        AND origin.stop_sequence < destination.stop_sequence;

Or, for a prettier view (and to match the second query in your question):

SELECT destination.trip_id,
       origin_stop.stop_name AS departure_stop,
       destination_stop.stop_name AS arrival_stop
    FROM stop_times AS origin
    INNER JOIN stop_times AS destination
        ON destination.trip_id = origin.trip_id
        AND destination.stop_id = 1540
    INNER JOIN stops AS origin_stop
        ON origin_stop.stop_id = origin.stop_id
    INNER JOIN stops AS destination_stop
        ON destination_stop.stop_id = destination.stop_id
    WHERE origin.stop_id = 1538
        AND origin.stop_sequence < destination.stop_sequence;
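To see how the origin/destination self-join behaves, here is a minimal sketch that runs it on a few rows. It makes several assumptions that are not part of the question: an in-memory SQLite database stands in for MySQL, the table is trimmed to four columns, and `REVERSE_TRIP` is a made-up trip that visits the two stops in the opposite order. Because `stop_sequence` is declared as varchar in the question's schema, the sketch casts it to an integer before comparing. Compared as strings, '1000' would sort before '200'.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stop_times "
    "(id INTEGER, trip_id TEXT, stop_id TEXT, stop_sequence TEXT)"
)
rows = [
    (1, "0051ML66220600132482", "1538", "100"),
    (2, "0051ML66220600132482", "1540", "200"),
    (3, "0051ML66220600132482", "1541", "300"),
    # Hypothetical trip that runs in the opposite direction.
    (4, "REVERSE_TRIP", "1540", "100"),
    (5, "REVERSE_TRIP", "1538", "200"),
]
conn.executemany("INSERT INTO stop_times VALUES (?, ?, ?, ?)", rows)

# Same shape as the query above; the CAST guards against string comparison.
QUERY = """
SELECT destination.trip_id
FROM stop_times AS origin
INNER JOIN stop_times AS destination
    ON destination.trip_id = origin.trip_id
    AND destination.stop_id = ?
WHERE origin.stop_id = ?
    AND CAST(origin.stop_sequence AS INTEGER)
        < CAST(destination.stop_sequence AS INTEGER)
"""


def trips_between(origin_stop, destination_stop):
    """Return trip_ids that visit origin_stop before destination_stop."""
    cur = conn.execute(QUERY, (destination_stop, origin_stop))
    return [row[0] for row in cur.fetchall()]


forward = trips_between("1538", "1540")
backward = trips_between("1540", "1538")
print(forward)
print(backward)
```

Swapping the two stops selects only the reverse trip, which is the point of the `stop_sequence` condition. In MySQL the equivalent cast would be `CAST(... AS UNSIGNED)`.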

For good performance, first create an index on stop_id and trip_id:

CREATE INDEX stop_times_stop_id_trip_id_index ON stop_times(stop_id, trip_id);
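One way to check whether such an index is picked up is to ask the database for its query plan. The sketch below does this with SQLite's `EXPLAIN QUERY PLAN`, again using SQLite only as a stand-in for MySQL (an assumption). The plan format and the optimizer's choices differ between the two systems, so treat the output as illustrative rather than as what MySQL would report.

```python
import sqlite3

# SQLite stands in for MySQL (assumption); the table is trimmed to three columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, stop_sequence TEXT)"
)
conn.executemany(
    "INSERT INTO stop_times VALUES (?, ?, ?)",
    [
        ("0051ML66220600132482", "1538", "100"),
        ("0051ML66220600132482", "1540", "200"),
    ],
)

# The index from the answer, unchanged.
conn.execute(
    "CREATE INDEX stop_times_stop_id_trip_id_index "
    "ON stop_times(stop_id, trip_id)"
)

# List the indexes that now exist on the table.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'stop_times'"
    )
]

# The last column of each plan row is the human-readable description.
plan_details = [
    row[-1]
    for row in conn.execute(
        """
        EXPLAIN QUERY PLAN
        SELECT destination.trip_id
        FROM stop_times AS origin
        INNER JOIN stop_times AS destination
            ON destination.trip_id = origin.trip_id
            AND destination.stop_id = '1540'
        WHERE origin.stop_id = '1538'
            AND origin.stop_sequence < destination.stop_sequence
        """
    )
]

print(index_names)
for detail in plan_details:
    print(detail)
```

In MySQL, the counterpart is to prefix the query with `EXPLAIN` and look at the `key` column. Since stop_id is declared as varchar in the question's schema, it may also be worth comparing against quoted literals such as '1538', because comparing a string column with a number can prevent MySQL from using the index.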

(Note that EternalHour's query identifies all trips that pass through either one of the stops, not only trips that pass through one stop and then the other.)

answered 2015-04-30T09:17:55.733

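It seems this is the query you need. I took out DISTINCT and replaced it with GROUP BY, and also replaced the WHERE conditions with IN. Your query suggests that stop_id should be of type INT rather than varchar, since you did not put quotes around the values, so the fiddle reflects that.

IN basically does an OR rather than an AND. The AND version returns nothing because the two stop_id values are never in the same row.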

SELECT trip_id 
FROM stop_times 
WHERE stop_id IN(1538,1540)
GROUP BY trip_id

Here is an SQLFiddle.
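The difference between AND and IN described above can be checked on a few rows. This is a minimal sketch under stated assumptions: SQLite stands in for MySQL, stop_id is stored as an integer as in the fiddle, and `ONLY_1538` is a made-up trip that never reaches stop 1540. The third query, with `HAVING COUNT(DISTINCT stop_id) = 2`, is not part of this answer. It is a common variant that keeps only trips visiting both stops, though it still ignores the order in which they are visited.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stop_times (trip_id TEXT, stop_id INTEGER)")
conn.executemany(
    "INSERT INTO stop_times VALUES (?, ?)",
    [
        ("0051ML66220600132482", 1538),
        ("0051ML66220600132482", 1540),
        ("ONLY_1538", 1538),  # hypothetical trip that never reaches 1540
    ],
)


def trip_ids(sql):
    """Run a query and return the sorted trip_id values it yields."""
    return sorted(row[0] for row in conn.execute(sql))


# AND: no single row has both stop_id values, so nothing matches.
with_and = trip_ids(
    "SELECT DISTINCT trip_id FROM stop_times "
    "WHERE stop_id = 1538 AND stop_id = 1540"
)

# IN: behaves like OR, so a trip with only one of the stops matches too.
with_in = trip_ids(
    "SELECT trip_id FROM stop_times "
    "WHERE stop_id IN (1538, 1540) GROUP BY trip_id"
)

# Variant (an addition, not from the answer): require both stops per trip.
with_having = trip_ids(
    "SELECT trip_id FROM stop_times "
    "WHERE stop_id IN (1538, 1540) "
    "GROUP BY trip_id HAVING COUNT(DISTINCT stop_id) = 2"
)

print(with_and)
print(with_in)
print(with_having)
```

The IN query also returns `ONLY_1538`, while the HAVING variant narrows the result to the trip that passes through both stops.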

answered 2015-04-30T08:55:52.197