I have a query that gives me wrong results.

Tables:

A
+----+
| id |
+----+
|  1 |
|  2 |
+----+

B
+----+----+
| id |  x |  B.id = A.id
+----+----+
|  1 |  1 |
|  1 |  1 |
|  1 |  0 |
+----+----+

C
+----+----+
| id |  y |  C.id = A.id
+----+----+
|  1 |  1 |
|  1 |  2 |
+----+----+

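What I want to do: select all rows from A. For each row in A, count the rows in B with x = 1 and the rows in B with x = 0, where B.id = A.id. For each row in A, also get the minimum y from C where C.id = A.id.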

The result I expect is:

+----+------+--------+---------+
| id |  min | count1 | count 2 |
+----+------+--------+---------+
|  1 |    1 |      2 |       1 |
|  2 | NULL |      0 |       0 |
+----+------+--------+---------+

First attempt: this does not work.

SELECT a.id,
       MIN(c.y),
       SUM(IF(b.x = 1, 1, 0)),
       SUM(IF(b.x = 0, 1, 0))
FROM   a
       LEFT JOIN b
              ON ( a.id = b.id )
       LEFT JOIN c
              ON ( a.id = c.id )
GROUP BY a.id

+----+------+--------+---------+
| id |  min | count1 | count 2 |
+----+------+--------+---------+
|  1 |    1 |      4 |       2 |
|  2 | NULL |      0 |       0 |
+----+------+--------+---------+

Second attempt: this works, but I am sure it performs badly.

SELECT a.id,
       MIN(c.y),
       b.x,
       b.y
FROM   a
       LEFT JOIN (SELECT b.id,
                         SUM(IF(b.x = 1, 1, 0)) x,
                         SUM(IF(b.x = 0, 1, 0)) y
                  FROM   b
                  GROUP  BY b.id) b
              ON ( a.id = b.id )
       LEFT JOIN c
              ON ( a.id = c.id )
GROUP BY a.id

+----+------+--------+---------+
| id |  min | count1 | count 2 |
+----+------+--------+---------+
|  1 |    1 |      2 |       1 |
|  2 | NULL |      0 |       0 |
+----+------+--------+---------+

Last attempt: this works too.

SELECT x.*,
       SUM(IF(b.x = 1, 1, 0)),
       SUM(IF(b.x = 0, 1, 0))
FROM   (SELECT a.id,
               MIN(c.y)
        FROM   a
               LEFT JOIN c
                      ON ( a.id = c.id )
        GROUP  BY a.id) x
       LEFT JOIN b
              ON ( b.id = x.id )
GROUP  BY x.id

Now my question: is the last one the best choice, or is there a way to write this query with only one select statement (as in the first attempt)?

1 Answer

Your joins are producing a Cartesian product for a given id, because B and C each contain multiple rows with that id.
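To make the row multiplication visible, here is a minimal sketch that rebuilds the question's sample tables and counts the joined rows per id. It uses an in-memory SQLite database driven from Python as a stand-in for MySQL (an assumption made only so the check is self-contained), and it writes the conditional sums with CASE because SQLite has no IF() function.

```python
import sqlite3

# In-memory SQLite stand-in for the question's MySQL tables (an assumption).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, x INTEGER);
    CREATE TABLE c (id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 1), (1, 1), (1, 0);
    INSERT INTO c VALUES (1, 1), (1, 2);
""")

# Number of joined rows per a.id before any aggregation:
# id 1 has 3 rows in b and 2 rows in c, so the join yields 3 * 2 = 6 rows.
joined_rows = con.execute("""
    SELECT a.id, COUNT(*)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
print(joined_rows)  # [(1, 6), (2, 1)]

# The first attempt, with IF(...) rewritten as CASE for SQLite.
first_attempt = con.execute("""
    SELECT a.id, MIN(c.y),
           SUM(CASE WHEN b.x = 1 THEN 1 ELSE 0 END),
           SUM(CASE WHEN b.x = 0 THEN 1 ELSE 0 END)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
print(first_attempt)  # [(1, 1, 4, 2), (2, None, 0, 0)]
```

Every row of b is repeated once per matching row of c, which is why the sums come out as 4 and 2 instead of 2 and 1.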

You can fix this by using count(distinct) instead of sum():

SELECT a.id, MIN(c.y),
       count(distinct (case when b.x = 1 then b.id end)),
       count(distinct (case when b.x = 0 then b.id end))
FROM   a
       LEFT JOIN b
              ON ( a.id = b.id )
       LEFT JOIN c
              ON ( a.id = c.id )
GROUP BY a.id;

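You can also fix this by pre-aggregating b (and/or c). That approach would be required if your aggregate were something like a sum over a column of b.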

Edit:

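You are right. The query above counts the distinct values of id in B, but B contains exact duplicate rows. (Personally, I think having a column named id that contains duplicates is a sign of poor design, but that is another issue.)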
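As a quick check of that undercount, the sketch below runs the count(distinct) query against the question's sample data. It again uses an in-memory SQLite database from Python in place of MySQL (an assumption for the sake of a self-contained example).

```python
import sqlite3

# In-memory SQLite stand-in for the question's MySQL tables (an assumption).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, x INTEGER);
    CREATE TABLE c (id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 1), (1, 1), (1, 0);
    INSERT INTO c VALUES (1, 1), (1, 2);
""")

# count(distinct b.id) can only ever see one distinct value per group,
# because every row of b in the group shares the same id.
distinct_counts = con.execute("""
    SELECT a.id, MIN(c.y),
           COUNT(DISTINCT CASE WHEN b.x = 1 THEN b.id END),
           COUNT(DISTINCT CASE WHEN b.x = 0 THEN b.id END)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
print(distinct_counts)  # [(1, 1, 1, 1), (2, None, 0, 0)]
```

For id 1 the expected counts were 2 and 1, but the two rows with x = 1 collapse into a single distinct id.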

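You can solve it by adding a real id to table b, because then count(distinct) would count the correct values. You can also solve it by aggregating the two tables before joining them: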

SELECT a.id, c.y, coalesce(x1, 0) as x1, coalesce(x0, 0) as x0
FROM   a
       LEFT JOIN (select b.id,
                         sum(b.x = 1) as x1,
                         sum(b.x = 0) as x0
                  from b
                  group by b.id
                 ) b
              ON ( a.id = b.id )
       LEFT JOIN (select c.id, min(c.y) as y
                  from c
                  group by c.id
                 ) c
              ON ( a.id = c.id );
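Both options can be compared side by side. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database from Python in place of MySQL, and it gives b a hypothetical surrogate key column named pk that is not in the original schema. COALESCE is used in the pre-aggregated variant so that ids without rows in b show 0 rather than NULL.

```python
import sqlite3

# In-memory SQLite stand-in for MySQL; the pk column is hypothetical.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (pk INTEGER PRIMARY KEY, id INTEGER, x INTEGER);
    CREATE TABLE c (id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b (id, x) VALUES (1, 1), (1, 1), (1, 0);
    INSERT INTO c VALUES (1, 1), (1, 2);
""")

# Option 1: count distinct surrogate keys, which survive the row fan-out.
by_key = con.execute("""
    SELECT a.id, MIN(c.y),
           COUNT(DISTINCT CASE WHEN b.x = 1 THEN b.pk END),
           COUNT(DISTINCT CASE WHEN b.x = 0 THEN b.pk END)
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Option 2: aggregate b and c first, so each join adds at most one row per id.
pre_agg = con.execute("""
    SELECT a.id, c.y, COALESCE(b.x1, 0), COALESCE(b.x0, 0)
    FROM a
    LEFT JOIN (SELECT id, SUM(x = 1) AS x1, SUM(x = 0) AS x0
               FROM b GROUP BY id) b ON a.id = b.id
    LEFT JOIN (SELECT id, MIN(y) AS y
               FROM c GROUP BY id) c ON a.id = c.id
    ORDER BY a.id
""").fetchall()

print(by_key)   # [(1, 1, 2, 1), (2, None, 0, 0)]
print(pre_agg)  # [(1, 1, 2, 1), (2, None, 0, 0)]
```

On this sample data both variants return the result the question asks for.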

Here is the SQL Fiddle for this problem.

Edit II:

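You can get it in one statement, but I am not so sure it would work on similar data. The idea is that you can count all the cases where x = 1 and then divide by the number of rows in the C table to get the real count: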

SELECT a.id, MIN(c.y), 
       coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)), 
       coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1))
FROM   a
       LEFT JOIN b
              ON ( a.id = b.id )
       LEFT JOIN c
              ON ( a.id = c.id )
GROUP BY a.id;

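This is a bit tricky, because you have to handle NULLs to get the right values. Note that this counts the distinct values of y to get the number of rows from table C. Your question again illustrates why having a unique integer primary key in every table is a good idea.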
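The caveat about similar data can be checked with a small sketch. It runs the division query twice against an in-memory SQLite database driven from Python (a stand-in for MySQL, so an assumption): once with the question's C rows and once with a hypothetical C that repeats the same y. The helper name run is made up for this illustration. One difference to keep in mind: SQLite does integer division here, whereas MySQL's / returns a decimal.

```python
import sqlite3

QUERY = """
    SELECT a.id, MIN(c.y),
           COALESCE(SUM(b.x = 1), 0) / COUNT(DISTINCT COALESCE(c.y, -1)),
           COALESCE(SUM(b.x = 0), 0) / COUNT(DISTINCT COALESCE(c.y, -1))
    FROM a
    LEFT JOIN b ON a.id = b.id
    LEFT JOIN c ON a.id = c.id
    GROUP BY a.id
    ORDER BY a.id
"""

def run(c_rows):
    """Build the sample tables with the given rows for c and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE a (id INTEGER);
        CREATE TABLE b (id INTEGER, x INTEGER);
        CREATE TABLE c (id INTEGER, y INTEGER);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES (1, 1), (1, 1), (1, 0);
    """)
    con.executemany("INSERT INTO c VALUES (?, ?)", c_rows)
    return con.execute(QUERY).fetchall()

# The question's data: y values are distinct, so the divisor equals
# the number of c rows per id and the counts come out right.
original = run([(1, 1), (1, 2)])
print(original)  # [(1, 1, 2, 1), (2, None, 0, 0)]

# Hypothetical data with a repeated y: two c rows but one distinct y,
# so the divisor is 1 and the inflated sums are returned unchanged.
repeated_y = run([(1, 1), (1, 1)])
print(repeated_y)  # [(1, 1, 4, 2), (2, None, 0, 0)]
```

So the single-statement form only holds as long as y is unique per id in C, which is another argument for a real key or for pre-aggregation.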

answered 2013-08-15T14:49:32.160