
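I have a table called event_user_fav_color_changed. Each row represents an event in which a user changed their favorite color. For each date in a given range, I want to get each user's favorite color as of that date.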

Here is a sample event_user_fav_color_changed table:

user_id    date          updated_at_datetime    fav_color  
1234       2020-01-01    2020-01-01 12:00:03    blue
1234       2020-01-05    2020-01-05 10:30:00    green

Here is a sample table containing the users and dates I am interested in:

user_id    date      
1234       2020-01-01
1234       2020-01-04
1234       2020-01-05
1234       2020-01-06

This is the desired output:

user_id    date         fav_color
1234       2020-01-01   blue
1234       2020-01-04   blue
1234       2020-01-05   green
1234       2020-01-06   green

4 Answers


One option is a correlated subquery. Assuming your user/dates table is called data, you would do the following:

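select
    d.*,
    (
        -- most recent color change on or before the requested date
        select e.fav_color
        from event_user_fav_color_changed e
        where e.user_id = d.user_id and e.date <= d.date
        -- updated_at_datetime breaks ties between changes made on the same day
        order by e.date desc, e.updated_at_datetime desc limit 1
    ) as fav_color
from data d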
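
To check the correlated subquery against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (the asker's engine may behave differently), and storing the dates as ISO-8601 text so that they compare correctly as strings is an assumption of this sketch.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table event_user_fav_color_changed (
        user_id integer, date text, updated_at_datetime text, fav_color text
    );
    insert into event_user_fav_color_changed values
        (1234, '2020-01-01', '2020-01-01 12:00:03', 'blue'),
        (1234, '2020-01-05', '2020-01-05 10:30:00', 'green');

    create table data (user_id integer, date text);
    insert into data values
        (1234, '2020-01-01'), (1234, '2020-01-04'),
        (1234, '2020-01-05'), (1234, '2020-01-06');
""")

# For every requested (user, date), pick the latest change on or before it.
rows = conn.execute("""
    select d.user_id, d.date,
           (select e.fav_color
            from event_user_fav_color_changed e
            where e.user_id = d.user_id and e.date <= d.date
            order by e.date desc, e.updated_at_datetime desc
            limit 1) as fav_color
    from data d
    order by d.user_id, d.date
""").fetchall()

for row in rows:
    print(row)
```

With the sample data this yields blue for 2020-01-01 and 2020-01-04 and green for 2020-01-05 and 2020-01-06, matching the desired output in the question.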
answered 2020-09-03T07:49:53.790

You can use the row_number() window function:

select * from
(
select user_id, date, updated_at_datetime, fav_color, 
row_number() over(partition by user_id,date order by updated_at_datetime desc) as rn
from tablename
)A where rn=1
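
As a rough illustration of what this row_number() query returns, the sketch below runs it in an in-memory SQLite database from Python (window functions need SQLite 3.25 or newer). The third event row, a second change on 2020-01-05, is hypothetical and was added only to show the tie-break. Note that the query keeps the last change per user per day; it does not by itself fill in the dates between changes.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table event_user_fav_color_changed (
        user_id integer, date text, updated_at_datetime text, fav_color text
    );
    insert into event_user_fav_color_changed values
        (1234, '2020-01-01', '2020-01-01 12:00:03', 'blue'),
        (1234, '2020-01-05', '2020-01-05 10:30:00', 'green'),
        -- hypothetical extra row: a second change on the same day
        (1234, '2020-01-05', '2020-01-05 18:45:00', 'red');
""")

# rn = 1 marks the latest change within each (user_id, date) group.
latest_per_day = conn.execute("""
    select user_id, date, fav_color
    from (
        select user_id, date, fav_color,
               row_number() over (
                   partition by user_id, date
                   order by updated_at_datetime desc
               ) as rn
        from event_user_fav_color_changed
    ) a
    where rn = 1
    order by user_id, date
""").fetchall()

for row in latest_per_day:
    print(row)
```

Only two rows survive: blue for 2020-01-01 and red (the later of the two changes) for 2020-01-05.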
answered 2020-09-03T06:28:24.300

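It sounds like you cannot restrict the lookup to any particular range, so each row basically has to search for the most recent update that happened on or before its date.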

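select d.user_id, d.date,
  (
    -- distinct collapses the window result to a single row for the scalar subquery
    select distinct first_value(e.fav_color) over (order by e.updated_at_datetime desc)
    from event_user_fav_color_changed e
    -- restrict to the same user and to changes on or before the requested date
    where e.user_id = d.user_id and e.date <= d.date
  ) as fav_as_of
from dates d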

I don't know anything about Presto, but I believe this query should work.

answered 2020-09-03T06:44:57.947

One way to express this is with a join and row_number():

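select uc.user_id, uc.date, uc.fav_color
from (select ud.user_id, ud.date, ufcc.fav_color,
             -- number the candidate changes per requested (user, date), newest first
             row_number() over (partition by ud.user_id, ud.date order by ufcc.date desc) as seqnum
      from user_dates ud join
           event_user_fav_color_changed ufcc
           on ud.user_id = ufcc.user_id and
              ud.date >= ufcc.date
     ) uc
where seqnum = 1;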

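If there are many color changes, that can be inefficient, because every requested date is joined to all earlier changes. A join using lead() may be more efficient: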

select ud.user_id, ud.date, ufcc.fav_color
from user_dates ud join
     (select ufcc.*,
             -- date of the user's next change, or null for the latest one
             lead(ufcc.date) over (partition by ufcc.user_id order by ufcc.date) as next_date
      from event_user_fav_color_changed ufcc
     ) ufcc
    on ud.user_id = ufcc.user_id and
       ud.date >= ufcc.date and
       (ud.date < ufcc.next_date or ufcc.next_date is null);
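
The lead() idea can be tried on the question's sample data with the sketch below, which uses an in-memory SQLite database from Python as a stand-in engine (an assumption; window functions need SQLite 3.25 or newer). It treats each color as valid on the half-open interval from its change date up to, but not including, the next change date, so that a requested date equal to a change date picks up the new color, as the desired output requires. It also assumes at most one change per user per day.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table event_user_fav_color_changed (
        user_id integer, date text, updated_at_datetime text, fav_color text
    );
    insert into event_user_fav_color_changed values
        (1234, '2020-01-01', '2020-01-01 12:00:03', 'blue'),
        (1234, '2020-01-05', '2020-01-05 10:30:00', 'green');

    create table user_dates (user_id integer, date text);
    insert into user_dates values
        (1234, '2020-01-01'), (1234, '2020-01-04'),
        (1234, '2020-01-05'), (1234, '2020-01-06');
""")

# Each change is valid from its own date until the next change (or forever).
result = conn.execute("""
    select ud.user_id, ud.date, c.fav_color
    from user_dates ud
    join (
        select user_id, date, fav_color,
               lead(date) over (partition by user_id order by date) as next_date
        from event_user_fav_color_changed
    ) c
      on ud.user_id = c.user_id
     and ud.date >= c.date
     and (ud.date < c.next_date or c.next_date is null)
    order by ud.user_id, ud.date
""").fetchall()

for row in result:
    print(row)
```

Because every requested date falls into exactly one interval, no row_number() filter is needed after the join.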

Or a lateral join:

select ud.user_id, ud.date, ufcc.fav_color
from user_dates ud cross join lateral
     (select e.fav_color
      from event_user_fav_color_changed e
      where ud.user_id = e.user_id and
            ud.date >= e.date
      -- newest change on or before the requested date
      order by e.date desc
      limit 1
     ) ufcc;
answered 2020-09-03T12:08:33.380