
A file of size 260 MB is stored in HDFS, where the default block size is 64 MB. When I run a MapReduce job against this file, it creates only 4 input splits. How is that calculated? Where did the remaining 4 MB go? Any input is much appreciated.


1 Answer


An input split is not always the block size. Input splits are a logical representation of the data, not a physical one. Your input splits might be 63 MB, 67 MB, 65 MB, 65 MB (or other sizes, depending on logical record boundaries)... See the examples at the links below, and the sketch after them.

Hadoop input split size vs block size

Another example - see section 3.3...
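
As for why you see exactly 4 splits instead of 5: Hadoop's FileInputFormat uses a slop factor (SPLIT_SLOP = 1.1) so that a small remainder gets folded into the last split rather than producing a tiny extra one. Below is a minimal standalone sketch of that loop; the class name and printout are illustrative, not Hadoop API, and it assumes default min/max split settings so the split size equals the block size:

```java
// Sketch of the split-count logic in FileInputFormat.getSplits().
// SPLIT_SLOP (1.1) lets the last split grow up to 10% past the split
// size instead of spawning a tiny trailing split.
public class SplitCalculator {
    private static final double SPLIT_SLOP = 1.1; // same value as FileInputFormat

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileLength = 260 * mb; // file size from the question
        long blockSize  = 64 * mb;  // HDFS default block size from the question

        // With default settings, splitSize = max(minSize, min(maxSize, blockSize))
        // reduces to the block size.
        long splitSize = blockSize;

        long bytesRemaining = fileLength;
        int splitCount = 0;
        // Cut full-size splits while more than 1.1 split-sizes remain.
        while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
            splitCount++;
            System.out.printf("split %d: %d MB%n", splitCount, splitSize / mb);
            bytesRemaining -= splitSize;
        }
        // The leftover becomes the final split: here 64 + 4 = 68 MB.
        if (bytesRemaining != 0) {
            splitCount++;
            System.out.printf("split %d: %d MB%n", splitCount, bytesRemaining / mb);
        }
        System.out.println("total splits: " + splitCount);
        // Prints splits of 64, 64, 64, 68 MB -> 4 splits total.
    }
}
```

So the "missing" 4 MB isn't lost: 68 / 64 = 1.0625, which is under the 1.1 slop threshold, so the last 4 MB rides along in a 68 MB final split instead of becoming a fifth split of its own.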

Answered 2018-02-11T20:37:55.597