I have created the following query in BigQuery:
SELECT
date,
userId,
SUM(totals.visits) totalvisits,
GROUP_CONCAT(device.deviceCategory) sequentialdevice
FROM (
SELECT
date,
visitStartTime,
customDimensions.value userId,
totals.visits,
device.deviceCategory
FROM
TABLE_DATE_RANGE([164345793.ga_sessions_], TIMESTAMP('20171127'), CURRENT_TIMESTAMP())
WHERE
customDimensions.index = 1
AND customDimensions.value CONTAINS "hip|"
GROUP BY
date,
visitStartTime,
userId,
totals.visits,
device.deviceCategory
HAVING
userId="hip|7e4fbce9-bbfb-4677-aab0-dcd02851fdb4"
ORDER BY
date ASC,
visitStartTime ASC)
GROUP BY
date,
userId
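As a temporary measure I am using the HAVING clause to test it against a single user (this will be removed in production).
This all works as expected: the query outputs the devices in the proper order (tablet, tablet, tablet, mobile, desktop). However, I want to remove the duplicates, so that the result is "tablet, mobile, desktop".
I tried the UNIQUE() function. It removes the duplicates, but the order is not preserved, so the output becomes "desktop, mobile, tablet".
Any help would be appreciated!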
UPDATE
I have rewritten the query in standard SQL and now run into a different problem with the STRING_AGG() function:
SELECT
date,
userId,
totalsvisits,
STRING_AGG(DISTINCT devicecategory ORDER BY date ASC, vstime ASC) deviceAgg
FROM (
SELECT
date,
visitStartTime vstime,
cd.value userId,
totals.visits totalsvisits,
device.deviceCategory devicecategory
FROM
`12314124123123.ga_sessions_*`,
UNNEST(customDimensions) AS cd
WHERE
cd.index=1
AND cd.value IS NOT NULL
GROUP BY
date,
visitStartTime,
userId,
totals.visits,
device.deviceCategory
HAVING
userId="hip|7e4fbce9-bbfb-4677-aab0-dcd02851fdb4"
ORDER BY
date ASC,
visitStartTime ASC)
GROUP BY
date,
userId,
totalsvisits
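The error returned is "An aggregate function that has both DISTINCT and ORDER BY arguments can only ORDER BY columns that are arguments to the function".
Obviously this works if we remove either DISTINCT or the ORDER BY clause from STRING_AGG, but we need both operations.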
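One possible way around this error, sketched here under assumptions rather than offered as a confirmed fix: do the de-duplication in the inner query with GROUP BY instead of DISTINCT, keep the earliest visitStartTime per device category, and then order STRING_AGG by that value. The aliases firstSeen and visits are made up for this illustration. The sketch also assumes that "preserving the order" means ordering each device by its first appearance within the day, and that each session has exactly one custom dimension with index 1.

```sql
#standardSQL
SELECT
  date,
  userId,
  SUM(visits) AS totalvisits,
  -- No DISTINCT needed: the inner query already returns one row per device.
  STRING_AGG(devicecategory ORDER BY firstSeen ASC) AS deviceAgg
FROM (
  SELECT
    date,
    cd.value AS userId,
    device.deviceCategory AS devicecategory,
    -- Hypothetical alias: earliest session start for this device on this day.
    MIN(visitStartTime) AS firstSeen,
    -- Hypothetical alias: visits for this device on this day.
    SUM(totals.visits) AS visits
  FROM
    `12314124123123.ga_sessions_*`,
    UNNEST(customDimensions) AS cd
  WHERE
    cd.index = 1
    -- Temporary test filter, to be removed in production.
    AND cd.value = "hip|7e4fbce9-bbfb-4677-aab0-dcd02851fdb4"
  GROUP BY
    date,
    cd.value,
    device.deviceCategory)
GROUP BY
  date,
  userId
```

For the sessions described above (tablet, tablet, tablet, mobile, desktop), the inner query would collapse the three tablet rows into one, so the aggregate should read tablet, mobile, desktop. The ORDER BY in the subquery is dropped because the ordering that matters is the one inside STRING_AGG. Note that a device that reappears later in the day (tablet, mobile, tablet) is listed only once, at its first position.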