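I am trying to count unique visitors. I first checked the overall total, without splitting it by any time frame.
Main table (a sample of a much larger table):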
+-------------+-----+-----+
| theDateTime | vD  | vis |
+-------------+-----+-----+
| 2018-10-03  | 123 | abc |
| 2018-10-04  | 123 | abc |
| 2018-10-04  | 123 | pqr |
| 2018-10-05  | 123 | xyz |
+-------------+-----+-----+
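The total distinct count for the table above is 3, but when I group by day, abc is counted twice: first on the 3rd, then on the 4th. I only want to count the first occurrence.
My total query: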
SELECT
    d.eId AS vD
  , COUNT(DISTINCT visitorId) AS vis
FROM decisions d
WHERE d.eId = 123
  AND timestamp BETWEEN unix_timestamp('2018-10-03 00:00:00')*1000
                    AND unix_timestamp('2018-10-06 12:17:00')*1000
GROUP BY d.eId
ORDER BY vD
My result:
+-----+-----+
| vD  | vis |
+-----+-----+
| 123 | 3   |
+-----+-----+
My daily query:
SELECT DISTINCT
    cast(from_unixtime(timestamp DIV 1000) AS date) AS theDateTime
  , d.eId AS vD
  , COUNT(DISTINCT visitorId) AS vis
FROM decisions d
WHERE timestamp BETWEEN unix_timestamp('2018-10-03 00:00:00')*1000
                    AND unix_timestamp('2018-10-06 12:17:00')*1000
  AND d.eId IN (11550123588)
GROUP BY cast(from_unixtime(timestamp DIV 1000) AS date),
         d.eId
ORDER BY vD, theDateTime
My result:
+-------------+-----+-----+
| theDateTime | vD  | vis |
+-------------+-----+-----+
| 2018-10-03  | 123 | 1   |
| 2018-10-04  | 123 | 2   |
| 2018-10-05  | 123 | 1   |
+-------------+-----+-----+
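On my real data the daily counts sum to 1122585, which is more than the overall distinct total. (In the sample above, the daily counts sum to 1 + 2 + 1 = 4, against a total of 3.)
I know this happens because a visitor who comes back on a different day is counted once for each day when I group by day. Is there a way to skip a visitor on day 2 if he was already counted on day 1?
Please help!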
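One way this might be approached is to first reduce the data to one row per visitor holding that visitor's earliest timestamp in the window, and only then group by day. The sketch below reuses the table and column names from the queries above. The names `first_seen` and `first_ts` are made up for illustration, and it assumes the Hive-style functions (`unix_timestamp`, `from_unixtime`, `DIV`) behave as they do in the daily query.

```sql
-- Sketch only: count each visitor on the first day they appear in the window.
-- Assumes decisions(eId, visitorId, timestamp) with timestamp in milliseconds,
-- as in the question. first_seen and first_ts are illustrative names.
SELECT
    cast(from_unixtime(first_ts DIV 1000) AS date) AS theDateTime
  , vD
  , COUNT(*) AS vis   -- the subquery has one row per (eId, visitor), so this is a distinct count
FROM (
    -- earliest timestamp per visitor within the filtered window
    SELECT
        d.eId            AS vD
      , d.visitorId
      , MIN(d.timestamp) AS first_ts
    FROM decisions d
    WHERE d.eId = 123
      AND d.timestamp BETWEEN unix_timestamp('2018-10-03 00:00:00')*1000
                          AND unix_timestamp('2018-10-06 12:17:00')*1000
    GROUP BY d.eId, d.visitorId
) first_seen
GROUP BY cast(from_unixtime(first_ts DIV 1000) AS date), vD
ORDER BY vD, theDateTime
```

On the sample table this would attribute abc to 2018-10-03 only, giving 1, 1, 1 for the three days, which adds up to the overall distinct count of 3. Note that "first day" here means the first day inside the filtered window, not the visitor's first visit ever.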