Let's assume that we have 3 mappers (m1, m2 and m3) and 2 reducers (r1 and r2).
Each reducer fetches its input partitions from the files generated by each mapper.
From the job history, I can extract the total input for each reduce task, but I would like to know how much each mapper contributes to that reducer's input.
For example, the reducer r1 will receive an input INPUT_r1 such that:
INPUT_r1 = ( partition fetched from m1 ) + ( partition fetched from m2 ) + ( partition fetched from m3 )
How can I find out the size of each of those partitions, that is, how many bytes each mapper sends to each reducer?
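To make the question concrete, here is a small Python sketch of the breakdown I am after. The byte counts are invented for illustration, and `partition_bytes`, `reducer_input` and `mapper_contributions` are hypothetical names, not anything Hadoop provides. It only shows the per-mapper, per-reducer table I would like to obtain and how it relates to the total reducer input already available in the job history.

```python
# Hypothetical shuffle sizes in bytes: partition_bytes[mapper][reducer].
# These numbers are invented for illustration; they do not come from a real job.
partition_bytes = {
    "m1": {"r1": 1200, "r2": 800},
    "m2": {"r1": 500, "r2": 1500},
    "m3": {"r1": 300, "r2": 700},
}


def reducer_input(reducer):
    """Total input of one reducer: the sum of the partitions fetched from every mapper."""
    return sum(sizes[reducer] for sizes in partition_bytes.values())


def mapper_contributions(reducer):
    """Per-mapper contribution to one reducer as (bytes, fraction of the reducer's total)."""
    total = reducer_input(reducer)
    return {
        mapper: (sizes[reducer], sizes[reducer] / total)
        for mapper, sizes in partition_bytes.items()
    }


if __name__ == "__main__":
    # INPUT_r1 = partition from m1 + partition from m2 + partition from m3
    print("INPUT_r1 =", reducer_input("r1"), "bytes")
    for mapper, (size, share) in mapper_contributions("r1").items():
        print(f"{mapper} -> r1: {size} bytes ({share:.0%})")
```

The job history gives me the value returned by `reducer_input`; what I am missing is the individual entries of the table.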