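A combiner processes the mapper's output records. If the mapper output records are fed to the combiner, why does my combiner have more input records than the mapper has output records?
I am getting about 80 extra records (83 by the counters below). I don't know where they come from or what they contain.
YARN dump of the MapReduce job: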
Map-Reduce Framework
Map input records=80000000
Map output records=80000000
Map output bytes=2560000000
Map output materialized bytes=80
Input split bytes=220
Combine input records=80000083
Combine output records=85
Reduce input groups=1
Reduce shuffle bytes=80
Reduce input records=2
Reduce output records=3
Spilled Records=87
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=4124
CPU time spent (ms)=90530
Physical memory (bytes) snapshot=573521920
Virtual memory (bytes) snapshot=2509766656
Total committed heap usage (bytes)=411041792
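
The size of the gap can be checked directly from the counters. The sketch below copies the relevant values from the dump and computes the difference. It also tests one possible accounting, which is an assumption and not something the dump itself confirms: Hadoop may run the combiner more than once on the map side (for example when spill files are merged), so records the combiner already emitted could be counted again as combiner input. The function names are hypothetical and exist only for this illustration.

```python
# Counter values copied verbatim from the dump above.
counters = {
    "Map output records": 80_000_000,
    "Combine input records": 80_000_083,
    "Combine output records": 85,
    "Reduce input records": 2,
}


def extra_combine_inputs(c):
    """Records the combiner read beyond what the mappers emitted."""
    return c["Combine input records"] - c["Map output records"]


def recombined_guess(c):
    """ASSUMPTION: combiner outputs that never reached the reducer
    were fed back into the combiner during a later merge pass."""
    return c["Combine output records"] - c["Reduce input records"]


extra = extra_combine_inputs(counters)
guess = recombined_guess(counters)

print("extra combiner inputs:", extra)
print("combiner outputs not seen by the reducer:", guess)
print("numbers agree:", extra == guess)
```

If the two numbers agree, the counters are at least consistent with the combiner having been applied again to its own intermediate output, though this alone does not prove it.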