Yes, the strategy above is not just fine, it's good. I use it in a production system and it works great, though you have to be careful and tailor the strategy so that it solves your use case effectively and efficiently.
Here are a few points on what I mean by effectively and efficiently:
- Make sure you identify the records to be pushed to Redshift in the most efficient way possible, i.e. find the candidate records with queries that are optimized for CPU and memory on the source database (see the first sketch after this list).
- Make sure to send the identified records to Redshift in an optimized way, which includes minimizing data size so you use as little storage and network bandwidth as possible. For example, gzip-compress the CSV files so they take minimal space in S3 and save bandwidth on the way in (see the second sketch below).
- Try to run the Redshift COPY queries in a way that lets the load execute in parallel (see the third sketch below).
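For the first point, a minimal sketch of what "identify efficiently" can look like, assuming a source PostgreSQL table with an indexed `updated_at` column and a persisted sync watermark; the table, column, and function names are all hypothetical:

```python
import psycopg2

def fetch_changed_rows(conn, last_sync_ts, batch_size=10_000):
    """Stream only the rows modified since the last sync, using a
    server-side cursor so the full result set never sits in memory."""
    with conn.cursor(name="changed_rows") as cur:  # server-side cursor
        cur.itersize = batch_size
        # The WHERE clause hits the index on updated_at, so the source
        # DB touches only new/changed rows instead of scanning the table.
        cur.execute(
            "SELECT id, payload, updated_at FROM orders "
            "WHERE updated_at > %s ORDER BY updated_at",
            (last_sync_ts,),
        )
        for row in cur:
            yield row
```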
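For the second point, a minimal sketch of gzip-compressing the CSV in memory before uploading, assuming boto3 credentials are already configured; the bucket and key are hypothetical:

```python
import csv
import gzip
import io

import boto3

def upload_gzipped_csv(rows, bucket, key):
    """Write rows as gzipped CSV into an in-memory buffer and upload it,
    so the object stored in S3 is a fraction of the raw CSV size."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        text = io.TextIOWrapper(gz, encoding="utf-8", newline="")
        csv.writer(text).writerows(rows)
        text.flush()
        text.detach()  # keep the wrapper from closing gz a second time
    buf.seek(0)
    # e.g. key = "staging/orders/part-000.csv.gz"
    boto3.client("s3").upload_fileobj(buf, bucket, key)
```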
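For the third point, a minimal sketch of a parallel load, assuming the gzipped parts were uploaded under one common S3 prefix and an IAM role authorizes Redshift to read the bucket; the table, bucket, and ARN are hypothetical. Redshift speaks the PostgreSQL wire protocol, so psycopg2 works for issuing the COPY:

```python
import psycopg2

COPY_SQL = """
COPY staging.orders
FROM 's3://my-etl-bucket/staging/orders/part-'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
CSV GZIP;
"""

def run_parallel_copy(dsn):
    # One COPY against the shared prefix loads every matching part file,
    # and Redshift spreads those files across its node slices.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(COPY_SQL)
    # Clean exit from the with-block commits, making the load visible.
```

The single COPY pointed at the prefix is what makes the load parallel: Redshift distributes the part files across node slices, so splitting your data into a number of files that is a multiple of the cluster's slice count keeps every slice busy.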
Hope this helps.