
I want to select users who open our app for the first time (event.name = "first_open") together with their subsequent visits over the following weeks (event.name = "user_engagement").

The query I have come up with so far is:

SELECT user_dim.app_info.app_instance_id AS id,
    FORMAT_DATE('%Y-%W', PARSE_DATE('%Y%m%d', event.date)) AS period,
    event.name
FROM `database.app_events_*`,
UNNEST(event_dim) AS event
WHERE event.name IN ("first_open", "user_engagement")
AND (_TABLE_SUFFIX BETWEEN '20180205' AND '20180330')
GROUP BY id, period, event.name
HAVING COUNT(id) >=2
ORDER BY id asc

But it also includes regular users who did not open the app for the first time during this period. How can I exclude them?


2 Answers


Something like this?

#standardSQL
SELECT user_dim.app_info.app_instance_id AS id, COUNT(*) as visits
FROM `data.source`,
UNNEST(event_dim) AS event
WHERE event.name = 'user_engagement' AND 
user_dim.app_info.app_instance_id IN (SELECT 
user_dim.app_info.app_instance_id FROM UNNEST(event_dim) AS event 
WHERE event.name = 'first_open')
GROUP BY id
HAVING COUNT(id) >= 2
ORDER BY visits DESC
answered 2018-04-04T13:41:18.913

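Without knowing your event structure or tracking logic it is hard to refine my answer, but basically the approach I would take is a subquery on the ID field (e.g. id IN (SELECT id FROM ... WHERE event_name = 'first_open')) rather than a single query that looks for both events. Alternatively, if you want to make sure the events actually happened at a later time (assuming you also track user_engagement during the first session), use a self-join that matches on the same ID but keeps only later events, based on the event timestamp or a session identifier.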
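
The self-join variant could look roughly like the sketch below. It assumes the legacy Firebase export schema used in the question (user_dim, event_dim) and that each event carries a timestamp_micros field; the table name and date range are copied from the question, and first_opens, first_open_micros and engagements are names made up here for illustration. I have not run it against your data, so treat it as a starting point.

```sql
#standardSQL
-- Sketch only: assumes the legacy Firebase export schema from the question.
WITH first_opens AS (
  -- Users whose first_open falls inside the period, with its timestamp.
  SELECT
    user_dim.app_info.app_instance_id AS id,
    MIN(event.timestamp_micros) AS first_open_micros
  FROM `database.app_events_*`
  CROSS JOIN UNNEST(event_dim) AS event
  WHERE event.name = 'first_open'
    AND _TABLE_SUFFIX BETWEEN '20180205' AND '20180330'
  GROUP BY id
)
SELECT
  f.id,
  FORMAT_DATE('%Y-%W', PARSE_DATE('%Y%m%d', event.date)) AS period,
  COUNT(*) AS engagements
FROM `database.app_events_*` AS t
JOIN first_opens AS f
  ON f.id = t.user_dim.app_info.app_instance_id
CROSS JOIN UNNEST(t.event_dim) AS event
WHERE event.name = 'user_engagement'
  AND t._TABLE_SUFFIX BETWEEN '20180205' AND '20180330'
  -- Keep only engagement events that happened after the first open.
  AND event.timestamp_micros > f.first_open_micros
GROUP BY f.id, period
ORDER BY f.id, period
```

Joining on first_opens drops every user without a first_open in the period, and the timestamp comparison excludes engagement events logged before the first open. To also skip engagement from the first session, the comparison would need an offset or a session identifier, depending on how you track sessions.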

answered 2018-04-02T07:44:58.060