Combiner 在 Mapper 之后和 Reducer 之前运行,它将接收给定节点上 Mapper 实例发出的所有数据作为输入。然后它向 Reducers 发出输出。所以组合器输入的记录应该少于地图输出。
12/08/29 13:38:49 INFO mapred.JobClient: Map-Reduce Framework
12/08/29 13:38:49 INFO mapred.JobClient: Reduce input groups=8649
12/08/29 13:38:49 INFO mapred.JobClient: Map output materialized bytes=306210
12/08/29 13:38:49 INFO mapred.JobClient: Combine output records=859412
12/08/29 13:38:49 INFO mapred.JobClient: Map input records=457272
12/08/29 13:38:49 INFO mapred.JobClient: Reduce shuffle bytes=0
12/08/29 13:38:49 INFO mapred.JobClient: Reduce output records=8649
12/08/29 13:38:49 INFO mapred.JobClient: Spilled Records=1632334
12/08/29 13:38:49 INFO mapred.JobClient: Map output bytes=331837344
12/08/29 13:38:49 INFO mapred.JobClient: **Combine input records=26154506**
12/08/29 13:38:49 INFO mapred.JobClient: **Map output records=25312392**
12/08/29 13:38:49 INFO mapred.JobClient: SPLIT_RAW_BYTES=218
12/08/29 13:38:49 INFO mapred.JobClient: Reduce input records=17298