我有一个设置,其中来自同一租户的数据到达运动流中的多个分片。所以有些东西看起来像:
tenant_id product shard_id
XYZ car 1
XYZ car 2
XYZ bike 1
XYZ bike 2
XYZ car 1
我希望结果看起来像:
tenant_id product frequency
XYZ car 3
XYZ bike 2
我目前拥有的 SQL 产生以下输出,它似乎在每个分片的基础上进行聚合:
tenant_id product frequency
XYZ car 2
XYZ car 1
XYZ bike 1
XYZ bike 1
我目前的解决方案:
CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" (
"tenant_id" varchar(128),
"product" varchar(128),
"frequency" DOUBLE
);
CREATE OR REPLACE PUMP "STREAM_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM
"tenant_id",
"product",
count("product") as "frequency"
FROM "SOURCE_SQL_STREAM_001"
GROUP BY "tenant_id", "product";