
A file of size 260 MB is stored in HDFS, where the default block size is 64 MB. When I run a MapReduce job against this file, it creates only 4 input splits. How is that calculated? Where did the remaining 4 MB go? Any input is much appreciated.


1 Answer


An input split is not always the block size. Input splits are a logical representation of the data, not a physical one. Your input splits might be 63 MB, 67 MB, 65 MB, 65 MB (or other sizes, depending on logical record boundaries)... See the examples at the links below, and the sketch after them.

Hadoop input split size vs block size

Another example - see section 3.3...
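
As for why you see exactly 4 splits instead of 5: Hadoop's FileInputFormat uses a slop factor (SPLIT_SLOP = 1.1) so that a small remainder gets folded into the last split rather than producing a tiny extra one. Below is a minimal standalone sketch of that loop; the class name and printout are illustrative, not Hadoop API, and it assumes default min/max split settings so the split size equals the block size:

```java
// Sketch of the split-count logic in FileInputFormat.getSplits().
// SPLIT_SLOP (1.1) lets the last split grow up to 10% past the split
// size instead of spawning a tiny trailing split.
public class SplitCalculator {
    private static final double SPLIT_SLOP = 1.1; // same value as FileInputFormat

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLength = 260 * mb; // file size from the question
        long blockSize  = 64 * mb;  // HDFS default block size from the question

        // With default settings, splitSize = max(minSize, min(maxSize, blockSize))
        // reduces to the block size.
        long splitSize = blockSize;

        long bytesRemaining = fileLength;
        int splitCount = 0;
        // Cut full-size splits while more than 1.1 split-sizes remain.
        while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
            splitCount++;
            System.out.printf("split %d: %d MB%n", splitCount, splitSize / mb);
            bytesRemaining -= splitSize;
        }
        // The leftover becomes the final split: here 64 + 4 = 68 MB.
        if (bytesRemaining != 0) {
            splitCount++;
            System.out.printf("split %d: %d MB%n", splitCount, bytesRemaining / mb);
        }
        System.out.println("total splits: " + splitCount);
        // Prints splits of 64, 64, 64, 68 MB -> 4 splits total.
    }
}
```

So the "missing" 4 MB isn't lost: 68 / 64 = 1.0625, which is under the 1.1 slop threshold, so the last 4 MB rides along in a 68 MB final split instead of becoming a fifth split of its own.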

Answered 2018-02-11T20:37:55.597