使用 Oracle SQL,我正在尝试计算对网站的总唯一访问次数。我用来编写查询的表没有时间戳,其中包括分钟和秒,只是 DDMMYY,表中的每一行都代表客户在页面上的点击。该表每小时指定一个新的“会话”,无论这是否真的反映了客户 POV 的新访问。我必须做的是使用非连续会话作为唯一访问的代理。因此,如果访问之间有一个小时的休息时间,则之前的连续分组是一次访问。我将访问定义为客户 ID + 会话日 + 会话小时的唯一组合。如果客户+天组合中有连续的会话时间,我将其计为一个会话。HOUR 字段包含将日期与小时连接起来的字符串值。
Example of Raw Data:
TRANS_TO_DATE CUSTOMER_ID HOUR
10/21/17 1007589445 October 21, 2017, Hour 1
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 3
10/21/17 1007589445 October 21, 2017, Hour 5
10/21/17 1007589445 October 21, 2017, Hour 6
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
11/1/17 1007589445 November 1, 2017, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 11
1/1/18 1007589445 January 1, 2018, Hour 14
1/1/18 1007589445 January 1, 2018, Hour 20
1/1/18 1007589445 January 1, 2018, Hour 22
访问次数实际上是这样的:
Customer_id Day Hour Visit Grouping
1007589445 October 21, 2017 1 Visit 1
1007589445 October 21, 2017 2 Visit 1
1007589445 October 21, 2017 3 Visit 1
1007589445 October 21, 2017 5 Visit 2
1007589445 October 21, 2017 6 Visit 2
1007589445 October 21, 2017 23 Visit 3
1007589445 November 1, 2017 10 Visit 1
1007589445 January 1, 2018 10 Visit 1
1007589445 January 1, 2018 11 Visit 1
1007589445 January 1, 2018 14 Visit 2
1007589445 January 1, 2018 20 Visit 3
1007589445 January 1, 2018 21 Visit 4
客户1007589445有
2017 年 10 月 21 日访问 3 次 - 2017 年 11 月 1 日访问 1 次 - 2018 年 1 月 1 日访问 4 次
总访问量:8
下面是我到目前为止需要修改以满足上述标准的 sql 代码。
select
CUSTOMER_ID,
TRANS_TO_DATE,
HOUR,
count (HOUR) as visits
from mstr_clickstream_vw
where trans_to_date between start_date and end_date
and web_store_ind='US'
group by CUSTOMER_ID, TRANS_TO_DATE,HOUR