We have a requirement to log an event to a DynamoDB table whenever an ad is served to an end user. The table receives more than 250 writes per second.
We want to aggregate this data and move it to Redshift for analytics.
As I understand it, the DynamoDB stream receives a record for every insert into the table. How can I group the stream records into batches and then process those batches? Are there any best practices for this kind of use case?
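For context, this is a minimal sketch of the kind of batch handler I have in mind, assuming an AWS Lambda function triggered by the stream (Lambda delivers stream records in batches up to a configurable size). The attribute name `ad_id` is just a placeholder for my table's key:

```python
# Sketch: aggregate ad-serve counts from a batch of DynamoDB stream records.
# In the real function, the aggregated counts would be written somewhere
# (e.g. S3 for a Redshift COPY) instead of being returned.
from collections import Counter

def handler(event, context=None):
    counts = Counter()
    for record in event.get("Records", []):
        # Only count new inserts, not modifies/removes.
        if record.get("eventName") != "INSERT":
            continue
        # DynamoDB stream records use typed attribute JSON, e.g. {"S": "ad-123"}.
        new_image = record["dynamodb"]["NewImage"]
        ad_id = new_image["ad_id"]["S"]  # "ad_id" is a hypothetical attribute name
        counts[ad_id] += 1
    return dict(counts)
```

Is this roughly the right shape, or is there a better way to do the batching before the data reaches Redshift?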
I have been reading about Apache Spark, and it seems Spark Streaming could do this kind of aggregation, but it does not read from DynamoDB Streams directly.
Any help or pointers would be appreciated.
Thanks