mysql - Why do SELECT results differ between mysql and sqlite?

Question

I am re-asking this question in a simplified and expanded form.

Consider these SQL statements:

create table foo (id INT, score INT);

insert into foo values (106, 4);
insert into foo values (107, 3);
insert into foo values (106, 5);
insert into foo values (107, 5);

select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
having not exists (
    select T2.id, avg(T2.score) avg2
    from foo T2
    group by T2.id
    having avg2 > avg1);

With sqlite, the select statement returns:

id          avg1      
----------  ----------
106         4.5       
107         4.0

and mysql returns:

+------+--------+
| id   | avg1   |
+------+--------+
|  106 | 4.5000 |
+------+--------+

As far as I know, the mysql result is correct and the sqlite result is incorrect. I tried casting to real in sqlite, as shown below, but it still returns two records:

select T1.id, cast(avg(cast(T1.score as real)) as real) avg1
from foo T1
group by T1.id
having not exists (
    select T2.id, cast(avg(cast(T2.score as real)) as real) avg2
    from foo T2
    group by T2.id
    having avg2 > avg1);

Why does sqlite return two records?
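For reference, the result the query is meant to produce can be worked out independently of any SQL engine. The following is a minimal Python sketch of that intended semantics (keep a group only if no group has a strictly greater average); the function name `expected_rows` is hypothetical and used only for illustration.

```python
from collections import defaultdict


def expected_rows(rows):
    """Return (id, average) for each group whose average is not exceeded.

    Mirrors the intent of the query: a group survives the HAVING NOT EXISTS
    filter only when no group has a strictly greater average score.
    """
    scores = defaultdict(list)
    for group_id, score in rows:
        scores[group_id].append(score)

    # Average score per group, like avg(score) ... group by id.
    averages = {g: sum(s) / len(s) for g, s in scores.items()}

    return sorted(
        (g, a)
        for g, a in averages.items()
        if not any(other > a for other in averages.values())
    )


if __name__ == "__main__":
    print(expected_rows([(106, 4), (107, 3), (106, 5), (107, 5)]))
```

For the four rows above, group 106 averages 4.5 and group 107 averages 4.0, so only 106 should survive, which matches the mysql output.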

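Quick update:

I ran the statement against the latest sqlite version (3.7.11) and still get two records.

Another update:

I sent an email about this issue to sqlite-users@sqlite.org.

In the meantime, I have been playing with the VDBE and found something interesting. I split the execution trace for each loop of the not exists (one per average group).

To get three average groups, I used the following statements: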

create table foo (id VARCHAR(1), score INT);

insert into foo values ('c', 1.5);
insert into foo values ('b', 5.0);
insert into foo values ('a', 4.0);
insert into foo values ('a', 5.0);

PRAGMA vdbe_listing = 1;
PRAGMA vdbe_trace=ON;

select avg(score) avg1
from foo
group by id
having not exists (
    select avg(T2.score) avg2
    from foo T2
    group by T2.id
    having avg2 > avg1);

In the trace we can clearly see that what should be r:4.5 has become i:5:

[Image: VDBE execution trace]

I now want to find out why this happens.

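Final edit:

So I have played enough with the sqlite source code. I understand the beast much better now, though I will leave it to the original developer to sort out, which he seems to have already done:

http://www.sqlite.org/src/info/430bb59d79

Interestingly, at least to me, it seems that newer versions (sometime after the version I am using) support inserting multiple records in one statement, as used in the test case added in the commit above: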

CREATE TABLE t34(x,y);
INSERT INTO t34 VALUES(106,4), (107,3), (106,5), (107,5);
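To see how a particular SQLite build behaves, the original statements can be replayed through Python's standard `sqlite3` module, which also reports the version of the linked SQLite library. This is only a sketch: the helper name `run_original_query` is hypothetical, and the row count is merely a hint based on the behavior described in this question (one row expected, two rows observed with the reported problem).

```python
import sqlite3

# The query from the question, unchanged.
ORIGINAL_QUERY = """
select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
having not exists (
    select T2.id, avg(T2.score) avg2
    from foo T2
    group by T2.id
    having avg2 > avg1)
"""


def run_original_query():
    """Create the sample table in memory and run the query from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo (id INT, score INT)")
    conn.executemany(
        "insert into foo values (?, ?)",
        [(106, 4), (107, 3), (106, 5), (107, 5)],
    )
    rows = conn.execute(ORIGINAL_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    rows = run_original_query()
    print(rows)
    # Two rows would match the behavior reported in the question.
    print("rows returned:", len(rows))
```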

score 1 · Accepted Answer

Let's look at this in two ways. I will use postgres 9.0 as my reference database.

(1)

-- select rows from foo 

select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
-- where we don't have any rows from T2
having  not exists (
-- select rows from foo
select T2.id, avg(T2.score) avg2
from foo T2
group by T2.id
-- where the average score for any row is greater than the average for 
-- any row in T1
having avg2 > avg1);

 id  |        avg1        
-----+--------------------
 106 | 4.5000000000000000
(1 row)

Then let's move some of the logic into the subquery and get rid of the 'not':

(2)

-- select rows from foo 
select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
-- where we do have rows from T2
having  exists (
-- select rows from foo
select T2.id, avg(T2.score) avg2
from foo T2
group by T2.id
-- where the average score is less than or equal to the average for any row in T1
having avg2 <= avg1);
-- I think this expression will be true for all rows as we are in effect doing a
--cartesian join 
-- with the 'having' only we don't display the cartesian row set

 id  |        avg1        
-----+--------------------
 106 | 4.5000000000000000
 107 | 4.0000000000000000
(2 rows)

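So you have to ask yourself what you actually mean when you run this correlated subquery in the having clause. If it evaluates every row against every row of the main query, we are doing a Cartesian join, and I don't think we should blame the SQL engine.

If you want every row whose average is less than the maximum average, what you should say is: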

select T1.id, avg(T1.score) avg1
from foo T1 group by T1.id
having avg1 not in
-- the derived table gets an alias (t), which mysql and older postgres versions require
(select max(avg1) from (select id, avg(score) avg1 from foo group by id) t);

score 1 · Accepted Answer

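I tried messing around with some query variants.

It seems sqlite has a bug when a previously declared field is used in a nested HAVING expression.

In your example, avg1 inside the second (nested) HAVING is always equal to 5.0.

Look: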

select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
having not exists (
    SELECT 1 AS col1 GROUP BY col1 HAVING avg1 = 5.0);

This one returns nothing, but executing the following query returns both records:

...
having not exists (
    SELECT 1 AS col1 GROUP BY col1 HAVING avg1 <> 5.0);

I could not find any similar bug in the sqlite ticket list.

score 0 · Accepted Answer

Have you tried this version?

select T1.id, avg(T1.score) avg1
from foo T1
group by T1.id
having not exists (
    select T2.id, avg(T2.score) avg2
    from foo T2
    group by T2.id
    having avg(T2.score) > avg(T1.score));

And this one (which should give the same result):

select T1.*
from
  ( select id, avg(score) avg1
    from foo 
    group by id
  ) T1
where not exists (
    select T2.id, avg(T2.score) avg2
    from foo T2
    group by T2.id
    having avg(T2.score) > avg1);

The query can also be handled with derived tables instead of a subquery in the HAVING clause:

select ta.id, ta.avg1
from 
  ( select id, avg(score) avg1
    from foo
    group by id
  ) ta
  JOIN
  ( select avg(score) avg1
    from foo 
    group by id
    order by avg1 DESC
    LIMIT 1
  ) tmp
  ON tmp.avg1 = ta.avg1
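The derived-table idea above can be tried out through Python's standard `sqlite3` module. The sketch below is a minimal illustration, not a definitive fix: the function name `top_groups_derived_table` is hypothetical, and the query is a lightly adapted form of the `where not exists` variant, with the averages computed once in a derived table so that the inner HAVING only compares against a plain column.

```python
import sqlite3


def top_groups_derived_table(rows):
    """Return the groups whose average score is not exceeded by any group.

    The per-group averages are computed in a derived table (T1), so the
    nested HAVING refers to an ordinary column rather than to an aggregate
    alias of the outer query.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo (id INT, score INT)")
    conn.executemany("insert into foo values (?, ?)", rows)
    query = """
        select T1.id, T1.avg1
        from (select id, avg(score) avg1 from foo group by id) T1
        where not exists (
            select T2.id
            from foo T2
            group by T2.id
            having avg(T2.score) > T1.avg1)
        order by T1.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print(top_groups_derived_table([(106, 4), (107, 3), (106, 5), (107, 5)]))
```

With the sample data from the question this should leave only group 106, the same single row that mysql returns.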
