1

I'm currently working on a project which is supposed to take some arbitrary input data, do some natural language processing, and dynamically generate a corresponding SQL query. I also have a "reference" set of SQL queries which I can use to compare my SQL against to verify that the SQL generation is accurate.

This is one such SQL query that I've generated:

SELECT DISTINCT t0.airline_code 
FROM ( 
    SELECT airline.* 
    FROM airline, flight 
    WHERE ( 
        ( airline.airline_code = flight.airline_code ) 
        AND 
        ( flight.flight_days = 'DAILY' ) 
    ) 
) 
AS t0 
INNER JOIN ( 
    SELECT airline.* 
    FROM airline, flight, airport_service, city 
    WHERE ( 
        ( airline.airline_code = flight.airline_code ) 
        AND 
        ( flight.from_airport = airport_service.airport_code ) 
        AND 
        ( airport_service.city_code = city.city_code ) 
        AND 
        ( city.city_name = 'BOSTON' ) 
    ) 
) 
AS t1 
ON t0.airline_code = t1.airline_code 
INNER JOIN ( 
    SELECT airline.* 
    FROM airline, flight, airport_service, city 
    WHERE ( 
        ( airline.airline_code = flight.airline_code ) 
        AND 
        ( flight.to_airport = airport_service.airport_code ) 
        AND 
        ( airport_service.city_code = city.city_code ) 
        AND 
        ( city.city_name = 'DALLAS' ) 
    ) 
) 
AS t2 
ON t1.airline_code = t2.airline_code;

Running this returns the following columns:

airline_code
------------
AA 
CO 
HP 
TW 
DL 
NW 
UA 
US 

The reference SQL, however, returns slightly different results:

SELECT DISTINCT airline.airline_code
FROM airline
WHERE airline.airline_code IN
        (SELECT flight.airline_code
         FROM flight
         WHERE (flight.flight_days = 'DAILY'
                AND (flight.from_airport IN
                         (SELECT airport_service.airport_code
                          FROM airport_service
                          WHERE airport_service.city_code IN
                                  (SELECT city.city_code
                                   FROM city
                                   WHERE city.city_name = 'BOSTON'))
                     AND flight.to_airport IN
                         (SELECT airport_service.airport_code
                          FROM airport_service
                          WHERE airport_service.city_code IN
                                  (SELECT city.city_code
                                   FROM city
                                   WHERE city.city_name = 'DALLAS')))));

Result:

airline_code
------------
AA 
DL 
TW 
UA 
US 

Obviously, the two are different in that the first one is using joins while the second one uses nested SQL statements. However, this doesn't appear to be causing any problems for the other generated SQL/reference SQL that I'm working with, which are structured similarly (the generated SQL uses joins, the reference SQL is nested).

I'm fairly new to SQL, and know next to nothing about databases, so I could be missing something stupidly obvious, but for the life of me, I can't see why the two SQL statements are returning different results. They seem functionally identical, as best as I can tell. Does anybody know what I'm doing wrong, and how I can fix the generated SQL to match the reference?

If it matters, I'm using Microsoft SQL Server 2012.

4

1 回答 1

1

bksi是对的,问题出在第一个查询中。

看:你会在第一次查询中得到所有每天都有航班的公司。

然后您RIGHT JOIN的公司有从波士顿出发的航班 - 这意味着您现在选择了每天(从任何地方)和(任何时间)从波士顿出发的航班,但不完全是从波士顿出发的每日航班。

是的,第三次加入为您提供了同时有每日航班、从波士顿出发和飞往达拉斯的航班的公司。

第二个查询,带有嵌套语句,提供您唯一的公司,每天都有从波士顿到达拉斯的航班。

于 2013-07-30T06:25:44.647 回答