amazon-kinesis - Differences in PipelineDB and AWS Kinesis Analytics use of Reference data

Question

I'm doing a comparison of AWS Kinesis Analytics to PipelineDB use of "reference" data in STREAM SQL.

http://docs.aws.amazon.com/kinesisanalytics/latest/dev/limits.html http://docs.pipelinedb.com/joins.html#joins

Question 1: JOIN on multiple reference tables

AWS Kinesis Analytics - only lets you join to reference data from one source. That seems really restrictive, unless I am misunderstanding it. I'd want to be able to JOIN on, say, both USERS and ADDRESS reference data. Is that not possible?

PipelineDB - says it supports JOINs, but the docs don't have examples of JOINs to multiple reference tables. Does PipelineDB support joining multiple reference tables in its STREAMs and/or CONTINUOUS VIEWs?

Question 2: Refreshing reference data

AWS Kinesis Analytics - says you have to jump through some hoops (e.g. calling AWS APIs, etc.) to refresh reference data stored in its S3 bucket for the stream.

PipelineDB - Can streams simply get the latest reference data as it is updated using standard SQL updates to the reference tables?

Can PipelineDB JOIN to regular SQL VIEWs, so, in essence the SQL VIEW is updated automatically each time the underlying data is changed?

score 0 · Accepted Answer

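PipelineDB lets you JOIN a stream against any number of tables, including other continuous views or regular views. The only thing you cannot JOIN a stream with is another stream (there are no stream-stream JOINs).
Whatever "reference data" exists at the time of the JOIN is what gets used to update the continuous view. In other words, updating reference data after the fact will not automatically change historical data in the continuous view, but newly arriving rows will reflect the updated reference data.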

Here is an example of a continuous view definition containing multiple JOINs:

https://github.com/pipelinedb/pipelinedb/blob/master/src/test/regress/sql/stream_table_join.sql#L61
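
To make the refresh semantics above concrete, here is a toy model in plain Python. It is not PipelineDB code and does not use any PipelineDB API. The `users` and `addresses` dicts, the `view` counter, and the `on_stream_row` function are all hypothetical stand-ins for two reference tables, a continuous view, and the per-row join. The sketch only assumes the behavior described in the answer: each stream row is joined against the reference data as it exists when the row arrives.

```python
from collections import Counter

# Hypothetical reference tables, modelled as dicts keyed by primary key.
users = {1: {"name": "alice", "address_id": 10}}
addresses = {10: {"city": "Berlin"}}

# Stand-in for a continuous view: running event count per (name, city).
view = Counter()


def on_stream_row(event):
    """Join one incoming stream row against both reference tables as they are right now."""
    user = users.get(event["user_id"])
    if user is None:
        return  # inner-join behavior: rows without a matching user are dropped
    address = addresses.get(user["address_id"])
    if address is None:
        return  # likewise for a missing address
    view[(user["name"], address["city"])] += 1


# Two events arrive while alice's address says Berlin.
on_stream_row({"user_id": 1})
on_stream_row({"user_id": 1})

# The reference data is updated afterwards (think: a plain SQL UPDATE on the table).
addresses[10]["city"] = "Paris"

# A later event picks up the new value; the earlier counts are left untouched.
on_stream_row({"user_id": 1})

print(dict(view))
```

In this model the two Berlin events stay under Berlin after the update and only the third event is counted under Paris, which matches the "historical data is not rewritten, new rows see the update" behavior the answer describes.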
