I have the following transactions
table:
customer_id purchase_date product category department quantity store_id
1 2020-10-01 Kit Kat Candy Food 2 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
1 2020-10-01 Snickers Candy Food 1 store_A
2 2020-10-01 Snickers Candy Food 2 store_A
2 2020-10-01 Baguette Bread Food 5 store_A
2 2020-10-01 iPhone Cell phones Electronics 2 store_A
3 2020-10-01 Sony PS5 Games Electronics 1 store_A
I would like to calculate the average number of products purchased (for each product
in the table). I'm also looking to calculate averages across each category
and each department
by accounting for all products within the same category
or department
respectively. Care should be taken to divide over unique customers AND the product quantity
being greater than 0 (a 0 quantity indicates a refund, and should not be accounted for).
So basically, the output table would like below:
...where store_id
and average_level_type
are partition columns.
Is there a way to achieve this in a single pass over the transactions table? or do I need to break down my approach into multiple steps?
Thanks!