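I want to analyze some clickstream data to determine which paid ad campaigns drive the most conversions.

I have a table in my database that contains the following: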

user_id |   sent_at        |   campaign_name    |  last_click_attribution   
  101   | 2018-10-01 13:04 |   Google_Branded   |  Facebook_Focus
  101   | 2018-10-01 13:07 |   Google_Branded   |  Facebook_Focus 
  101   | 2018-10-02 13:09 |   Facebook_Focus   |  Facebook_Focus
  102   | 2018-09-25 13:04 |   Google_Focus     |  Google_Branded
  102   | 2018-09-27 09:24 |   Google_Branded   |  Google_Branded
  102   | 2018-10-01 11:25 |   Google_Branded   |  Google_Branded
  103   | 2018-09-27 13:04 |   Google_Branded   |  Google_Branded
  103   | 2018-09-28 09:15 |   Google_Branded   |  Google_Branded
  103   | 2018-09-29 18:34 |   Google_Branded   |  Google_Branded
  103   | 2018-09-30 21:02 |   Google_Branded   |  Google_Branded

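campaign_name is the campaign tied to the ad the user clicked to reach our site. last_click_attribution is the last ad the user clicked before creating a user account.

I would like to write a PostgreSQL query that returns the following: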

user_id |   last_click_attribution |   second_last_ad    |  third_last_ad  |....   
  101   | Facebook_Focus           |   Google_Branded    |  Google_Branded
  102   | Google_Branded           |   Google_Branded    |  Google_Focus
  103   | Google_Branded           |   Google_Branded    |  Google_Branded

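I think this could be done with a crosstab or by joining two views, but I'm not sure how to go about it.

Thanks for your help!

Any other suggestions for useful clickstream analyses, along with example SQL queries to refer to, would also be much appreciated.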

1 Answer

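You can generate a row number in a subquery, numbering each user's clicks from most recent to oldest, and then use conditional aggregation to pivot those rows into columns.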

-- Sample table; last_click_attribution is left out because the query derives it.
CREATE TABLE T(
   user_id int,
   sent_at timestamp,
   campaign_name varchar(50)
);


INSERT INTO T VALUES (101, '2018-10-01 13:04','Google_Branded');   
INSERT INTO T VALUES (101, '2018-10-01 13:07','Google_Branded');   
INSERT INTO T VALUES (101, '2018-10-02 13:09','Facebook_Focus');   
INSERT INTO T VALUES (102, '2018-09-25 13:04','Google_Focus');     
INSERT INTO T VALUES (102, '2018-09-27 09:24','Google_Branded');   
INSERT INTO T VALUES (102, '2018-10-01 11:25','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-27 13:04','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-28 09:15','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-29 18:34','Google_Branded');   
INSERT INTO T VALUES (103, '2018-09-30 21:02','Google_Branded');   

Query 1

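-- rn = 1 is each user's most recent click, rn = 2 the one before it, and so on.
-- MAX(CASE ...) keeps the single campaign_name that matches each rn per user.
SELECT  user_id,
        MAX(CASE WHEN rn = 1 THEN campaign_name END) AS last_click_attribution,
        MAX(CASE WHEN rn = 2 THEN campaign_name END) AS second_last_ad,
        MAX(CASE WHEN rn = 3 THEN campaign_name END) AS third_last_ad,
        MAX(CASE WHEN rn = 4 THEN campaign_name END) AS fourth_last_ad
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY sent_at DESC) AS rn
  FROM T
) t1
GROUP BY user_id
ORDER BY user_id;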
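
As a PostgreSQL-specific variation, the same pivot can be sketched with the aggregate FILTER clause instead of CASE expressions. This assumes PostgreSQL 9.4 or later, where FILTER is available, and the same T table as above; it should return the same rows as Query 1.

```sql
-- Sketch only: assumes PostgreSQL 9.4+ (aggregate FILTER clause) and the table T above.
SELECT user_id,
       MAX(campaign_name) FILTER (WHERE rn = 1) AS last_click_attribution,
       MAX(campaign_name) FILTER (WHERE rn = 2) AS second_last_ad,
       MAX(campaign_name) FILTER (WHERE rn = 3) AS third_last_ad,
       MAX(campaign_name) FILTER (WHERE rn = 4) AS fourth_last_ad
FROM (
  -- Number each user's clicks from most recent (1) to oldest.
  SELECT user_id,
         campaign_name,
         ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY sent_at DESC) AS rn
  FROM T
) t1
GROUP BY user_id
ORDER BY user_id;
```

Either form needs one output column per click position, so the number of columns has to be fixed in advance.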

Results

| user_id | last_click_attribution | second_last_ad |  third_last_ad | fourth_last_ad |
|---------|------------------------|----------------|----------------|----------------|
|     101 |         Facebook_Focus | Google_Branded | Google_Branded |         (null) |
|     102 |         Google_Branded | Google_Branded |   Google_Focus |         (null) |
|     103 |         Google_Branded | Google_Branded | Google_Branded | Google_Branded |
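
For the original goal of finding which campaigns drive the most conversions, one possible follow-up is to count users by their last-click campaign. This is only a sketch: it assumes every user in T went on to create an account and that the most recent click is the one to credit, neither of which the sample data confirms. The alias last_clicks and the column converted_users are names made up for this example.

```sql
-- Sketch only: assumes each user in T converted and the latest click gets the credit.
SELECT last_click_attribution,
       COUNT(*) AS converted_users
FROM (
  -- DISTINCT ON keeps the first row per user_id in ORDER BY order,
  -- i.e. the most recent click for each user.
  SELECT DISTINCT ON (user_id)
         user_id,
         campaign_name AS last_click_attribution
  FROM T
  ORDER BY user_id, sent_at DESC
) last_clicks
GROUP BY last_click_attribution
ORDER BY converted_users DESC;
```

With the sample rows above, this gives Google_Branded 2 and Facebook_Focus 1.
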
Answered 2018-10-02T21:34:17.487