
I need to calculate the progress of each map task running on all nodes in a Hadoop cluster. I was thinking of dividing the size of the processed data by the size of the whole input data, but I am not sure how to get this information for a task.

I see that the TaskStatus class has a getProgress() method, but it has no documentation. Does it return the value that I need?


1 Answer


For a map task, yes: getProgress() returns how far the mapper has progressed through its input. For reduce tasks, the calculation is less straightforward. This article has a pretty good explanation.
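To make the map-side number concrete, here is a minimal sketch of the ratio the question describes: bytes consumed from the split divided by the split's length, clamped to the range 0 to 1. The class `SplitProgress` and its method `fraction` are hypothetical names invented for this illustration and are not part of Hadoop. I believe the stock file-based record readers compute something along these lines, but treat that as an assumption and check the source for your Hadoop version.

```java
// Hypothetical helper, NOT a Hadoop class: illustrates the
// "processed bytes / total bytes" ratio for one input split.
final class SplitProgress {
    private SplitProgress() {}

    /**
     * @param start byte offset where the split begins
     * @param end   byte offset where the split ends (exclusive)
     * @param pos   current read position within the file
     * @return fraction of the split consumed, clamped to [0.0, 1.0]
     */
    static float fraction(long start, long end, long pos) {
        if (end <= start) {
            return 0.0f; // empty split: nothing to measure
        }
        float raw = (pos - start) / (float) (end - start);
        // A reader may sit before the split start or run slightly past
        // its end, so clamp instead of reporting < 0% or > 100%.
        return Math.max(0.0f, Math.min(1.0f, raw));
    }
}
```

For example, `SplitProgress.fraction(100, 200, 150)` gives 0.5. In practice you should not need to compute this yourself: with the newer `org.apache.hadoop.mapreduce` API, `Job.getTaskReports(TaskType.MAP)` should return one `TaskReport` per map task, each exposing `getProgress()`, though which client API is available depends on your Hadoop version.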

Answered 2013-03-31T22:35:28.697