1

Suppose I have two tables.

table one:

| col1 |
|------|
| do   |
| big  |
| gone |

table two

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| do   | blah | blah | big  |
| big  | do   | blah | gone |
| blah | blah | blah | blah |

How can I search table two and display the rows that contain all of the col1 values from table one?

For example, the result for the given case should be:

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| big  | do   | blah | gone |
4 Answers

1

Nasty problem...

SELECT two.*
  FROM two
 WHERE (SELECT COUNT(*) FROM one) =
       (CASE WHEN col1 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col2 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col3 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col4 IN (SELECT * FROM one) THEN 1 ELSE 0 END
       )

The word "efficiency" should not be mentioned in connection with this query.

answered 2013-06-21T23:48:11.010
1

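Perhaps the trickiest part is guaranteeing that every value from the first table is actually present in the row of the second table. Just counting the matching columns is not enough; you also have to make sure that each value is counted only once: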

select t.*
from two t left outer join
     one o1
     on o1.col1 = t.col1 left outer join
     one o2
     on o2.col1 = t.col2 and o2.col1 not in (coalesce(t.col1, '')) left outer join
     one o3
     on o3.col1 = t.col3 and o3.col1 not in (coalesce(t.col1, ''), coalesce(t.col2, '')) left outer join
     one o4
     on o4.col1 = t.col4 and o4.col1 not in (coalesce(t.col1, ''), coalesce(t.col2, ''), coalesce(t.col3, '')) cross join
     (select count(*) as cnt from one) const
where const.cnt = ((case when o1.col1 is not null then 1 else 0 end) +
                   (case when o2.col1 is not null then 1 else 0 end) +
                   (case when o3.col1 is not null then 1 else 0 end) +
                   (case when o4.col1 is not null then 1 else 0 end)
                  )

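This looks up each column's value in the one table, but only if that value has not already been seen in an earlier column of the same row. If the one table contains duplicates, there is the question of how to handle them: does that mean the value must appear that many times?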
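To see why plain counting is not enough, here is a minimal sketch that runs both ideas against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a convenient stand-in, since the question does not name an engine, and the fourth row of `two` is a hypothetical addition that is not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (col1 TEXT);
    CREATE TABLE two (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
    INSERT INTO one VALUES ('do'), ('big'), ('gone');
    INSERT INTO two VALUES
        ('do',   'blah', 'blah', 'big'),
        ('big',  'do',   'blah', 'gone'),
        ('blah', 'blah', 'blah', 'blah'),
        ('do',   'do',   'do',   'blah');  -- hypothetical row, not in the question
""")

# Counting only: how many of the four columns hold some value from `one`?
count_only = """
    SELECT two.* FROM two
    WHERE (SELECT COUNT(*) FROM one) =
          (CASE WHEN col1 IN (SELECT col1 FROM one) THEN 1 ELSE 0 END +
           CASE WHEN col2 IN (SELECT col1 FROM one) THEN 1 ELSE 0 END +
           CASE WHEN col3 IN (SELECT col1 FROM one) THEN 1 ELSE 0 END +
           CASE WHEN col4 IN (SELECT col1 FROM one) THEN 1 ELSE 0 END)
"""

# Counting each value once: a column only matches if its value did not
# already appear in an earlier column of the same row.
count_distinct = """
    SELECT t.*
    FROM two t
    LEFT OUTER JOIN one o1 ON o1.col1 = t.col1
    LEFT OUTER JOIN one o2 ON o2.col1 = t.col2
         AND o2.col1 NOT IN (COALESCE(t.col1, ''))
    LEFT OUTER JOIN one o3 ON o3.col1 = t.col3
         AND o3.col1 NOT IN (COALESCE(t.col1, ''), COALESCE(t.col2, ''))
    LEFT OUTER JOIN one o4 ON o4.col1 = t.col4
         AND o4.col1 NOT IN (COALESCE(t.col1, ''), COALESCE(t.col2, ''),
                             COALESCE(t.col3, ''))
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM one) const
    WHERE const.cnt = (CASE WHEN o1.col1 IS NOT NULL THEN 1 ELSE 0 END +
                       CASE WHEN o2.col1 IS NOT NULL THEN 1 ELSE 0 END +
                       CASE WHEN o3.col1 IS NOT NULL THEN 1 ELSE 0 END +
                       CASE WHEN o4.col1 IS NOT NULL THEN 1 ELSE 0 END)
"""

count_rows = conn.execute(count_only).fetchall()
distinct_rows = conn.execute(count_distinct).fetchall()
print(count_rows)     # includes the ('do', 'do', 'do', 'blah') false positive
print(distinct_rows)  # only the row that holds do, big and gone
```

The hypothetical row repeats 'do' three times, so the plain count reaches 3 even though 'big' and 'gone' are missing from it. The join-based version counts 'do' only once for that row and rejects it.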

answered 2013-06-22T00:01:55.890
1

This assumes CTEs and bit operations (left shift and OR), which are available in Postgres (and may exist in other DBMSs as well).

WITH rnk AS (
    SELECT col1, (rank() OVER (ORDER BY col1))::integer AS rnk
    FROM one
    )
, five AS (
    SELECT t.*
            , 0::integer
            | COALESCE( 1<< o1.rnk, 0)
            | COALESCE( 1<< o2.rnk, 0)
            | COALESCE( 1<< o3.rnk, 0)
            | COALESCE( 1<< o4.rnk, 0)
            AS mask
    FROM two t
    LEFT JOIN rnk o1 ON o1.col1 = t.col1
    LEFT JOIN rnk o2 ON o2.col1 = t.col2
    LEFT JOIN rnk o3 ON o3.col1 = t.col3
    LEFT JOIN rnk o4 ON o4.col1 = t.col4
    )
SELECT * FROM five f5
WHERE f5.mask IN (14)
    ;

Update: this one may be a bit cleaner, because it hides the bit shifting inside the CTE.

WITH xrnk AS (
    SELECT col1, 1::integer << (rank() OVER (ORDER BY col1))::integer AS xrnk
    FROM one
    )
, five AS (
    SELECT t.*
        , ( COALESCE( o1.xrnk, 0)
          | COALESCE( o2.xrnk, 0)
          | COALESCE( o3.xrnk, 0)
          | COALESCE( o4.xrnk, 0)
          ) >> 1
        AS mask
    FROM two t
    LEFT JOIN xrnk o1 ON o1.col1 = t.col1
    LEFT JOIN xrnk o2 ON o2.col1 = t.col2
    LEFT JOIN xrnk o3 ON o3.col1 = t.col3
    LEFT JOIN xrnk o4 ON o4.col1 = t.col4
    )
SELECT * FROM five f5
WHERE f5.mask IN (7)
    ;
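The literals 14 and 7 in the two queries above depend on `one` holding exactly three distinct values. The following sketch reproduces the arithmetic in plain Python to show where they come from. The helper names `rank_values` and `row_mask` are made up for illustration, and the sketch assumes that `one` has no duplicates and that the database sorts these strings the same way Python does.

```python
def rank_values(values):
    """Map each distinct value to its 1-based rank in sorted order."""
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


def row_mask(row, ranks):
    """OR together one bit per column value that appears in `ranks`."""
    mask = 0
    for cell in row:
        if cell in ranks:
            mask |= 1 << ranks[cell]
    return mask


one = ["do", "big", "gone"]
two = [
    ("do", "blah", "blah", "big"),
    ("big", "do", "blah", "gone"),
    ("blah", "blah", "blah", "blah"),
]

ranks = rank_values(one)               # big -> 1, do -> 2, gone -> 3
full_mask = row_mask(one, ranks)       # (1 << 1) | (1 << 2) | (1 << 3)
matches = [row for row in two if row_mask(row, ranks) == full_mask]

print(full_mask, full_mask >> 1)       # the two literals used in the queries
print(matches)
```

Because ranks start at 1, bit 0 is never set, which is why the first query compares against 14 and the second, after shifting right by one, against 7. With a different number of rows in `one` the literal would have to change, and a 32-bit integer mask limits how many values can be tracked.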

The simplest solution is always the best:

SELECT * FROM two t
WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> t.col1 AND o.col1 <> t.col2
          AND o.col1 <> t.col3 AND o.col1 <> t.col4
        )
        ;

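Update (thanks to @dbenham): the simple query is rather sensitive to NULLs in the two table, which have to be handled with a bunch of COALESCE() wrappers. Obviously, the 'XxxX' literal is assumed to be a value that never matches anything in one: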

SELECT * FROM two t
WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> COALESCE(t.col1, 'XxxX' )
          AND o.col1 <> COALESCE(t.col2, 'XxxX' )
          AND o.col1 <> COALESCE(t.col3, 'XxxX' )
          AND o.col1 <> COALESCE(t.col4, 'XxxX' )
        )
        ;
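As a rough check of the NULL problem, the sketch below runs both variants of the NOT EXISTS query against an in-memory SQLite database via Python's `sqlite3` module. SQLite is used only because it is easy to run (the answer itself targets Postgres), and the second row of `two` is a hypothetical row containing NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (col1 TEXT);
    CREATE TABLE two (col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
    INSERT INTO one VALUES ('do'), ('big'), ('gone');
    INSERT INTO two VALUES
        ('big', 'do', 'blah', 'gone'),  -- holds every value of one
        ('big', NULL, NULL,   NULL);    -- hypothetical row with NULLs
""")

# Plain version: a comparison with NULL yields NULL, so the inner WHERE
# never becomes true for the second row and NOT EXISTS lets it through.
plain = """
    SELECT * FROM two t
    WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> t.col1 AND o.col1 <> t.col2
          AND o.col1 <> t.col3 AND o.col1 <> t.col4)
"""

# Guarded version: NULLs are replaced by a sentinel that is assumed
# never to occur in one.col1, so the comparisons are true or false.
guarded = """
    SELECT * FROM two t
    WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> COALESCE(t.col1, 'XxxX')
          AND o.col1 <> COALESCE(t.col2, 'XxxX')
          AND o.col1 <> COALESCE(t.col3, 'XxxX')
          AND o.col1 <> COALESCE(t.col4, 'XxxX'))
"""

plain_rows = conn.execute(plain).fetchall()
guarded_rows = conn.execute(guarded).fetchall()
print(plain_rows)    # both rows, including the one with NULLs
print(guarded_rows)  # only the row that really holds do, big and gone
```

The sentinel only works if 'XxxX' can never appear in `one`. On engines that offer a NULL-safe comparison, such as IS DISTINCT FROM in PostgreSQL, that operator may be a cleaner alternative to the COALESCE() wrappers.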
answered 2013-06-22T15:19:58.150
0

You did not say which SQL engine you are using; it can make a difference.

I am providing a solution that requires support for the row_number() function. I believe at least Oracle, DB2, and SQL Server all support row_number().

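Once the distinct values in table one are pivoted into a single row, the problem is fairly simple. If table one contains more than 4 distinct values, there cannot be any matches. It seems like there should be a better way to do the pivot, but I know this solution works.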

I have taken care to ensure that the query returns all rows in two if one is empty, and that duplicate rows in one are ignored.

with 
uniqueOne as ( 
  select distinct col1 from one 
),
ranked as (
  select col1, row_number() over (order by col1) seq from uniqueOne
),
vals as (
  select t1.col1 val1, 
         t2.col1 val2, 
         t3.col1 val3, 
         t4.col1 val4
    from (select 1 dummy) dummy
    left join ranked t1 on t1.seq=1
    left join ranked t2 on t2.seq=2
    left join ranked t3 on t3.seq=3
    left join ranked t4 on t4.seq=4
    left join ranked t5 on t5.seq=5
   where t5.seq is null
)
select two.*
  from two
 cross join vals
 where (vals.val1 is null or vals.val1 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val2 is null or vals.val2 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val3 is null or vals.val3 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val4 is null or vals.val4 in (two.col1, two.col2, two.col3, two.col4))
;


Here is a live demo of the solution.


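Gosh, I guess I should read my own answers and do some research more often. SQL Server has a PIVOT operator that makes the solution very efficient. Oracle also has PIVOT, but it uses a different syntax.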

Here is a working demo of the SQL Server PIVOT solution. Check out the sweet execution plan.

Here is the SQL Server query:

with 
uniqueOne as ( 
  select distinct col1 from one 
),
ranked as (
  select col1, row_number() over (order by col1) seq from uniqueOne
),
vals as (
  select [1] val1, [2] val2, [3] val3, [4] val4, [5] val5
    from ranked
    pivot ( min(col1) for seq in ([1],[2],[3],[4],[5]) ) PivotTable
)
select two.*
  from two
  join vals on val5 is null
 where (vals.val1 is null or vals.val1 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val2 is null or vals.val2 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val3 is null or vals.val3 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val4 is null or vals.val4 in (two.col1, two.col2, two.col3, two.col4))
;
answered 2013-06-22T01:41:32.160