amazon-web-services - Steps to load DynamoDB data into Redshift?

Question

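I would like to know how to load data from DynamoDB into Redshift.

According to the documentation, DynamoDB is a NoSQL database and Redshift is an RDBMS.

So how can unstructured data be handled in a normalized way?

When do I need to normalize the data?

I would like to know whether Redshift keeps the complete data or a transformed version of it.

I would also like to know the best way to load incremental data.

Can anyone suggest the steps for this process?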

score 3 · Accepted Answer

Loading Data from DynamoDB

The Amazon Redshift COPY command can be used to load a DynamoDB table into a Redshift table. This will load the complete DynamoDB table into Redshift.

See documentation: Loading Data from an Amazon DynamoDB Table

COPY matches DynamoDB attribute names to the column names of the Redshift table. Only attributes that have a matching column are loaded; the rest are discarded.
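As a rough illustration of the COPY step, the sketch below renders the statement as a string from Python. The function name build_copy_statement, the table names and the IAM role ARN are all hypothetical placeholders, not anything from the question. The READRATIO value (the percentage of the DynamoDB table's provisioned read throughput the load may use) is set to 50 purely as an example, and the 1 to 200 range check reflects my reading of the Redshift documentation.

```python
def build_copy_statement(redshift_table: str, dynamodb_table: str,
                         iam_role_arn: str, read_ratio: int = 50) -> str:
    """Render a Redshift COPY statement that reads from a DynamoDB table."""
    # READRATIO caps how much of the DynamoDB table's provisioned read
    # throughput the load may consume (assumed valid range: 1 to 200).
    if not 1 <= read_ratio <= 200:
        raise ValueError("read_ratio must be between 1 and 200")
    return (
        f"COPY {redshift_table} "
        f"FROM 'dynamodb://{dynamodb_table}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"READRATIO {read_ratio};"
    )


# Hypothetical names, for illustration only.
statement = build_copy_statement(
    redshift_table="customers",
    dynamodb_table="Customers",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

The helper does no quoting or escaping, so it is only suitable for trusted, hard-coded identifiers. The resulting statement would be run against Redshift with whatever SQL client you already use.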

Loading incremental data

COPY always reads the whole DynamoDB table, so to load only part of it (e.g. only rows where Country='USA'), first load the complete table into a temporary staging table, then use normal INSERT SQL commands in Redshift to copy the desired rows into the target table.

See:

Documentation: Updating and Inserting New Data
StackOverflow: Loading data (incrementally) into Amazon Redshift, S3 vs DynamoDB vs Insert
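The staging-table pattern described above can be sketched as follows. To keep the example runnable without a cluster, it uses Python's built-in sqlite3 module as a stand-in for Redshift, and the table names, columns and rows are invented for illustration. In Redshift the staging table would be filled by COPY rather than by the plain INSERT used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Target table that accumulates the rows we want to keep.
cur.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
# Staging table standing in for the temporary table that COPY would fill.
cur.execute(
    "CREATE TEMP TABLE customers_staging (id INTEGER, name TEXT, country TEXT)"
)

# In Redshift this step would be the COPY from DynamoDB; here it is faked.
cur.executemany(
    "INSERT INTO customers_staging VALUES (?, ?, ?)",
    [(1, "Ann", "USA"), (2, "Bo", "Canada"), (3, "Cy", "USA")],
)

# Copy only the desired subset from staging into the target table.
cur.execute(
    "INSERT INTO customers (id, name, country) "
    "SELECT id, name, country FROM customers_staging WHERE country = 'USA'"
)
conn.commit()

loaded = cur.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(loaded)
```

The same INSERT ... SELECT shape works with other filters, for example a timestamp column if the DynamoDB items carry one, which is an assumption about your data rather than something COPY provides.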

Normalization, foreign & primary keys

DynamoDB is a NoSQL database, so there are no relational concepts between tables and no foreign keys.

When creating tables in Redshift that will receive your data from DynamoDB, you can specify foreign keys. These are not enforced by Redshift, but they are used by the query optimizer.

Once data has been imported into Redshift, you can perform relational queries (e.g. using JOIN) between tables.
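A small sketch of what a declared but unenforced foreign key plus a JOIN looks like. It again uses sqlite3 as a local stand-in for Redshift, with enforcement explicitly switched off to mimic Redshift's behaviour. The customers and orders tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Leave foreign-key enforcement off to mimic Redshift, where the
# constraint is informational only.
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers (customer_id),"
    " amount REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bo")])
# Order 103 points at a customer that does not exist; it is accepted
# because the foreign key is not enforced.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(101, 1, 20.0), (102, 1, 5.0), (103, 99, 7.5)],
)

# Relational query across the two loaded tables.
totals = conn.execute(
    "SELECT c.name, SUM(o.amount) FROM customers c "
    "JOIN orders o ON o.customer_id = c.customer_id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(totals, order_count)
```

Because the orphaned order is silently dropped by the inner JOIN, keeping the loaded data consistent is up to the load process, not the database.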

Your data does not need to be normalized. In fact, data warehouses such as Redshift are often loaded with wide tables and duplicated data, which makes it easier to query the data with fewer JOINs.
