I am trying to JOIN
data from a user changelog-style table onto an events table with matching IDs.
The tables are as follows:
project_events
Schema
timestamp TIMESTAMP
event_id STRING
user_id STRING
data STRING
Sample data
| timestamp | event_id | user_id | data |
|-----------------------------|-----------|---------|------------------|
| 2020-08-22 17:01:18.807 UTC | hHZuTE8Y= | ABC123 | {"some":"json" } |
| 2020-08-20 16:57:28.022 UTC | tF5Gky8Q= | ZXY432 | {"foo":"item" } |
| 2020-08-15 16:44:25.607 UTC | 1dOU8pOo= | ABC123 | {"bar":"val" } |
users_changelog
Schema
timestamp TIMESTAMP
event_id STRING
operation STRING
user_id STRING
data STRING
Sample data
| timestamp | event_id | operation | user_id | data |
|-----------------------------|-----------|-----------|---------|---------------------|
| 2020-08-30 12:50:59.036 UTC | mGdNKy+o= | DELETE | ABC123 | {"name":"removed" } |
| 2020-08-20 16:50:59.036 UTC | mGdNKy+o= | UPDATE | ABC123 | {"name":"final" } |
| 2020-08-05 20:45:36.936 UTC | mIICo9LY= | UPDATE | ZXY432 | {"name":"asdf" } |
| 2020-08-05 20:45:21.023 UTC | nEDKyCks= | UPDATE | ABC123 | {"name":"other" } |
| 2020-08-03 12:40:49.036 UTC | GxnbUqQ0= | CREATE | ABC123 | {"name":"initial" } |
| 1970-01-01 00:00:00 UTC | 1y+6fVWo= | IMPORT | ZXY432 | {"name":"test" } |
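Note: operation can be "CREATE", "UPDATE", "DELETE", or "IMPORT". Because a user can be updated many times, there can be multiple rows with the same user_id.
The goal is to show each event's event_id and data columns together with the data from the latest operation in the users table for the matching ID, that is, the latest changelog row whose timestamp is not after the event's. With the sample data, the expected result would be: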
| event_id | event_data | user_id | user_data |
|-----------|------------------|---------|-------------------|
| hHZuTE8Y= | {"some":"json" } | ABC123 | {"name":"final" } |
| tF5Gky8Q= | {"foo":"item" } | ZXY432 | {"name":"asdf" } |
| 1dOU8pOo= | {"bar":"val" } | ABC123 | {"name":"other" } |
I tried the following, but it produces duplicate rows (one for each row in the changelog table with a matching id):
SELECT
events.event_id as event_id,
events.data as event_data,
users.user_id as user_id,
users.data as user_data
FROM my_project.my_dataset.project_events as events
LEFT JOIN my_project.my_dataset.users_changelog as users
ON events.user_id = users.user_id AND users.timestamp <= events.timestamp
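
One direction that might remove the duplicates is to rank each event's matching changelog rows by timestamp and keep only the newest one. The following is a minimal sketch in BigQuery standard SQL, not run against the real tables. It assumes that event_id uniquely identifies a row in project_events; the `rn` column is a name introduced here purely for illustration.

```sql
SELECT
  event_id,
  event_data,
  user_id,
  user_data
FROM (
  SELECT
    events.event_id AS event_id,
    events.data AS event_data,
    events.user_id AS user_id,
    users.data AS user_data,
    -- Number the changelog rows matched to each event, newest first.
    -- Assumption: event_id is unique per event row.
    ROW_NUMBER() OVER (
      PARTITION BY events.event_id
      ORDER BY users.timestamp DESC
    ) AS rn
  FROM my_project.my_dataset.project_events AS events
  LEFT JOIN my_project.my_dataset.users_changelog AS users
    ON events.user_id = users.user_id
    AND users.timestamp <= events.timestamp
)
-- Keep only the latest changelog row at or before each event.
WHERE rn = 1
```

Because the join is still a LEFT JOIN, an event with no earlier changelog row would be kept with a NULL user_data. With the sample data, this should pick "final", "asdf", and "other" for the three events, matching the expected result above.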