Good morning,

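I'm trying to transpose some data in BigQuery. I've looked at other Stack Overflow questions about this, but the approach there seems to rely on legacy SQL (using group_concat_unquoted) rather than standard SQL. I would use legacy SQL, but I've run into problems with nested data in the past, so I've stuck to standard SQL.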

Here is my example. For some context, I'm trying to map out some customer journeys, as shown below:

uniqueid | page_flag | order_of_pages
A        | Collection|   1
A        | Product   |   2
A        | Product   |   3
A        | Login     |   4
A        | Delivery  |   5
B        | Clearance |   1
B        | Search    |   2
B        | Product   |   3
C        | Search    |   1
C        | Collection|   2
C        | Product   |   3

But I'd like to transpose the data so that it looks like this:

uniqueid | 1          | 2          | 3       | 4     | 5 
A        | Collection | Product    | Product | Login | Delivery
B        | Clearance  | Search     | Product | NULL  | NULL
C        | Search     | Collection | Product | NULL  | NULL

I tried using multiple left joins, but I get the following error:

select a.uniqueid, 
b.page_flag as page1,
c.page_flag as page2,
d.page_flag as page3,
e.page_flag as page4,
f.page_flag as page5

from

(select distinct uniqueid, 
(case when uniqueid is not null then 1 end) as page_hit1,
(case when uniqueid is not null then 2 end) as page_hit2,
(case when uniqueid is not null then 3 end) as page_hit3,
(case when uniqueid is not null then 4 end) as page_hit4,
(case when uniqueid is not null then 5 end) as page_hit5
from `mytable`) a

LEFT JOIN (
SELECT *
from `mytable`) b on a.uniqueid = b.uniqueid
and a.page_hit1 = b.order_of_pages


LEFT JOIN (
SELECT *
from `mytable`) c on a.uniqueid = c.uniqueid
and a.page_hit2 = c.order_of_pages


LEFT JOIN (
SELECT *
from `mytable`) d on a.uniqueid = d.uniqueid
and a.page_hit3 = d.order_of_pages


LEFT JOIN (
SELECT *
from `mytable`) e on a.uniqueid = e.uniqueid
and a.page_hit4 = e.order_of_pages


LEFT JOIN (
SELECT *
from `mytable`) f on a.uniqueid = f.uniqueid
and a.page_hit5 = f.order_of_pages



Error: Query exceeded resource limits for tier 1. Tier 13 or higher required.

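I have also looked into using array functions, but I've never used them before, and I'm not sure whether they only work for transposing in the opposite direction. Any advice would be much appreciated.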

Thanks

1 Answer

For BigQuery Standard SQL, you can pivot with conditional aggregation, which avoids the self-joins entirely:

#standardSQL
SELECT 
  uniqueid,
  MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
  MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
  MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
  MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
  MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid 

You can test / play with it using the dummy data from the question:

#standardSQL
WITH `mytable` AS (
  SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
  SELECT 'A', 'Product', 2 UNION ALL
  SELECT 'A', 'Product', 3 UNION ALL
  SELECT 'A', 'Login', 4 UNION ALL
  SELECT 'A', 'Delivery', 5 UNION ALL
  SELECT 'B', 'Clearance', 1 UNION ALL
  SELECT 'B', 'Search', 2 UNION ALL
  SELECT 'B', 'Product', 3 UNION ALL
  SELECT 'C', 'Search', 1 UNION ALL
  SELECT 'C', 'Collection', 2 UNION ALL
  SELECT 'C', 'Product', 3 
)
SELECT 
  uniqueid,
  MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
  MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
  MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
  MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
  MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid 
ORDER BY uniqueid   

The result is:

uniqueid    p1          p2          p3      p4      p5   
A           Collection  Product     Product Login   Delivery     
B           Clearance   Search      Product null    null     
C           Search      Collection  Product null    null
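
For reference, BigQuery Standard SQL also has a PIVOT operator, which postdates this answer. The following is only a sketch, assuming the operator is available in your environment and reusing the same `mytable` and the p1..p5 aliases from above. The subquery keeps just the three relevant columns, because PIVOT groups by every column that is neither aggregated nor pivoted on.

```sql
#standardSQL
SELECT *
FROM (
  -- keep only the grouping column, the value column and the pivot column
  SELECT uniqueid, page_flag, order_of_pages
  FROM `mytable`
)
PIVOT (
  -- one output column per listed order_of_pages value
  MAX(page_flag)
  FOR order_of_pages IN (1 AS p1, 2 AS p2, 3 AS p3, 4 AS p4, 5 AS p5)
)
ORDER BY uniqueid
```

As with the MAX(IF(...)) version, the list of positions still has to be written out by hand.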

Depending on your needs, you can also consider the approach below (although it is not a pivot):

#standardSQL
SELECT uniqueid,
   STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid   

If you run it against the same dummy data as above, the result is:

uniqueid    journey  
A           Collection>Product>Product>Login>Delivery    
B           Clearance>Search>Product     
C           Search>Collection>Product    
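
Regarding the array functions mentioned in the question: they can be used for this direction too. Below is a minimal sketch, assuming the same `mytable`, assuming order_of_pages runs 1, 2, 3, ... without gaps for each uniqueid, and assuming page_flag is never NULL. It collects each journey into an ordered array and reads the positions back out; SAFE_OFFSET returns NULL instead of raising an error when a journey is shorter than the requested position. The name `pages` is introduced here only for illustration.

```sql
#standardSQL
SELECT
  uniqueid,
  -- array positions are zero-based, so offset 0 is the first page
  pages[SAFE_OFFSET(0)] AS p1,
  pages[SAFE_OFFSET(1)] AS p2,
  pages[SAFE_OFFSET(2)] AS p3,
  pages[SAFE_OFFSET(3)] AS p4,
  pages[SAFE_OFFSET(4)] AS p5
FROM (
  -- one row per uniqueid, with its pages collected in visit order
  SELECT
    uniqueid,
    ARRAY_AGG(page_flag ORDER BY order_of_pages) AS pages
  FROM `mytable`
  GROUP BY uniqueid
)
ORDER BY uniqueid
```

Unlike the MAX(IF(...)) version, this keys each column on the position within the array rather than on the value of order_of_pages, so the two only agree under the no-gaps assumption.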
Answered 2017-08-01T13:25:47.020