
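I need to find all unique possible combinations of the values in a table column. For example, for the column values 1, 2, 3, 4, 5, I would like the result to be [1,2], [1,3], [1,4], [1,5], [2,1], [2,3], and so on.

I would appreciate any pointers on constructing a query that finds these value combinations.

Thanks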


2 Answers


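You can perform a cross join in BigQuery by using subselects that add a constant key value, and then joining on that constant.

For example, the query below computes the cross join of {1, 2, 3} and {2, 4, 6}, excluding pairs whose two values are equal: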

SELECT t1.num as first, t2.num as second 
FROM (
    SELECT num, 1 as key 
    FROM (
        SELECT 1 as num), (
        SELECT 2 as num), (
        SELECT 3 as num)) as t1
JOIN (
    SELECT num, 1 as key 
    FROM (
        SELECT 2 as num), (
        SELECT 4 as num), (
        SELECT 6 as num)) as t2
ON t1.key = t2.key
WHERE t1.num <> t2.num

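Note that this uses a BigQuery "trick" to create the two input tables. If you are simply doing this against an existing table, it would look like this: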

SELECT t1.num as first, t2.num as second 
FROM (
    SELECT foo as num, 1 as key 
    FROM [my_dataset.my_table]) as t1
JOIN (
    SELECT foo as num, 1 as key 
    FROM [my_dataset.my_table]) as t2
ON t1.key = t2.key
WHERE t1.num <> t2.num
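
The constant-key idea is not tied to BigQuery, so it can be tried locally. Below is a minimal sketch using Python's built-in `sqlite3` module; the table `my_table`, its column `foo`, the sample values, and the column name `join_key` are stand-ins chosen for illustration, and SQLite's dialect differs from the BigQuery syntax above.

```python
import sqlite3

# Hypothetical stand-in for my_dataset.my_table with a single column foo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (foo INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", [(1,), (2,), (3,)])

# Every row carries the same join_key, so the equi-join pairs each row
# with every other row, which is exactly what a cross join does.
constant_key_sql = """
SELECT t1.num AS first_num, t2.num AS second_num
FROM (SELECT foo AS num, 1 AS join_key FROM my_table) AS t1
JOIN (SELECT foo AS num, 1 AS join_key FROM my_table) AS t2
  ON t1.join_key = t2.join_key
WHERE t1.num <> t2.num          -- drop pairs such as (1, 1)
ORDER BY first_num, second_num
"""

pairs = conn.execute(constant_key_sql).fetchall()
conn.close()
print(pairs)  # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
```

Both (1, 2) and (2, 1) appear, which matches the ordered pairs asked for in the question.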
answered 2013-09-28T17:27:55.183

A cross join may be useful.

See this demo: http://www.sqlfiddle.com/#!12/59af5/1

ANSI SQL syntax uses the CROSS JOIN operator:

create table val( x int );
insert into val values(1),(2),(3),(4),(5);

SELECT a.x a, b.x b
FROM val a
CROSS JOIN val b
WHERE a.x <> b.x
ORDER BY a,b;



Another form of this query, without CROSS JOIN, should work on most DBMSs, but the ANSI form is recommended for clarity:

SELECT a.x a, b.x b
FROM val a, val b
WHERE a.x <> b.x
ORDER BY a,b;


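Note that a cross join over a large data set will affect your database's performance: for 100 values it generates 100 x 100 = 10,000 rows, and for 1,000 values it generates 1,000,000 rows.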
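
To make the growth concrete, here is a rough sketch that counts the rows with Python's built-in `sqlite3` module. It assumes the column holds the distinct values 1..n; the helper name `cross_join_counts` is made up for this example. The full cross join has n x n rows, and the `a.x <> b.x` filter removes the n rows that pair a value with itself.

```python
import sqlite3


def cross_join_counts(n):
    """Return (rows in the full cross join, rows left after a.x <> b.x)
    for a table val holding the distinct values 1..n."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE val (x INTEGER)")
    conn.executemany("INSERT INTO val VALUES (?)", [(i,) for i in range(1, n + 1)])
    total = conn.execute(
        "SELECT COUNT(*) FROM val a CROSS JOIN val b"
    ).fetchone()[0]
    filtered = conn.execute(
        "SELECT COUNT(*) FROM val a CROSS JOIN val b WHERE a.x <> b.x"
    ).fetchone()[0]
    conn.close()
    return total, filtered


for n in (5, 100, 1000):
    print(n, cross_join_counts(n))
# 5 (25, 20)
# 100 (10000, 9900)
# 1000 (1000000, 999000)
```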

answered 2013-09-28T08:26:26.250