I am trying to adopt HUDI in our project.
I am planning three layers of data:
Raw (S3) --> Cleaned (HUDI, append only) --> Standard (HUDI, upserts)
The idea is to keep a Cleaned bucket that holds the cleaned data in append-only mode, for use by data scientists.
Storing it as a HUDI table should help with GDPR compliance, since HUDI supports record-level deletes.
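
To make the setup concrete, here is a rough sketch of the write options I have in mind for each layer. The `hoodie.*` keys are standard HUDI Spark datasource options, but the helper function, the table names, the column names and the S3 path are all made up for illustration, and I have not run this against a real cluster.

```python
# Sketch only: table names, column names and the helper are hypothetical.
# Only the "hoodie.*" option keys are standard HUDI Spark datasource configs.

ALLOWED_OPERATIONS = {"insert", "bulk_insert", "upsert", "delete"}


def hudi_write_options(table_name, operation, record_key, precombine_field, partition_field):
    """Build the options dict that would be passed to a Spark DataFrame writer."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported write operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
    }


# Cleaned layer: append only, so every batch is written with "insert".
cleaned_options = hudi_write_options(
    "cleaned_events", "insert", "event_id", "ingested_at", "event_date"
)

# Standard layer: the latest version of each record, written with "upsert".
standard_options = hudi_write_options(
    "standard_events", "upsert", "event_id", "ingested_at", "event_date"
)

# GDPR erasure on the Cleaned layer: same table, but with the "delete" operation.
gdpr_delete_options = hudi_write_options(
    "cleaned_events", "delete", "event_id", "ingested_at", "event_date"
)

# With Spark and the HUDI bundle available, each dict would be used roughly like:
#   df.write.format("hudi").options(**cleaned_options).mode("append").save("s3://<bucket>/cleaned/")
```

So the Cleaned table would only ever see `insert` writes, plus an occasional `delete` for GDPR requests, while the Standard table would be written with `upsert`.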
Is it a good idea to use HUDI for an append-only bucket?
Are there any issues with doing that?
Any advice would be appreciated.