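Why don't you just write a SequenceFile with <Text, BytesWritable> instead of implementing your own format?
Here is an example for some arbitrary images; you should store their paths in yourImagePaths: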
// omitted try / catch and finally statements
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path output = new Path("/tmp/out.seq");
List<String> yourImagePaths = new LinkedList<>();
// TODO: fill in your image paths here
SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, output,
    Text.class, BytesWritable.class);
for (String file : yourImagePaths) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    org.apache.hadoop.io.IOUtils.copyBytes(fs.open(new Path(file)), out, conf);
    writer.append(new Text(file), new BytesWritable(out.toByteArray()));
}
writer.close();
Basically, it writes the path as the key (to identify your image) and the raw bytes of the image as the value.
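To check what ended up in the file, you could read it back outside of a job. The following is only a minimal sketch, assuming Hadoop 2 or later (for the option-style `SequenceFile.Reader` constructor); the class name `DumpImageSequenceFile` and the helper `listKeys` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper class: lists the records of a <Text, BytesWritable> SequenceFile.
public class DumpImageSequenceFile {

    // Returns every key (image path) stored in the file, in file order.
    static List<String> listKeys(Configuration conf, Path input) throws IOException {
        List<String> keys = new ArrayList<>();
        try (SequenceFile.Reader reader =
                 new SequenceFile.Reader(conf, SequenceFile.Reader.file(input))) {
            // The same Text/BytesWritable instances are reused for each record.
            Text key = new Text();
            BytesWritable value = new BytesWritable();
            while (reader.next(key, value)) {
                // getLength() is the number of valid bytes; the backing
                // array returned by getBytes() may be longer than that.
                System.out.println(key + "\t" + value.getLength() + " bytes");
                keys.add(key.toString());
            }
        }
        return keys;
    }

    public static void main(String[] args) throws IOException {
        // Same path as in the writer snippet above.
        listKeys(new Configuration(), new Path("/tmp/out.seq"));
    }
}
```

Note that the keys are copied out with `toString()` because the reader overwrites the `Text` object on every call to `next`.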
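Now you can read it in a Hadoop job, and it will be split automatically. You only need to declare that the input key is Text, that the value is BytesWritable, and that SequenceFileInputFormat must be used.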
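A job set up that way might look roughly like the sketch below. It assumes the newer `org.apache.hadoop.mapreduce` API; the class names `ImageSizeJob` and `ImageSizeMapper`, the helper `imageSize`, and the output path `/tmp/image-sizes` are hypothetical and only there for illustration. The mapper just emits the byte count of each image, where you would put your real processing.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical map-only job: emits (image path, size in bytes) for each record.
public class ImageSizeJob {

    // Number of valid image bytes in a record value.
    static int imageSize(BytesWritable value) {
        return value.getLength();
    }

    // Input key/value types match what the writer snippet stored.
    public static class ImageSizeMapper
            extends Mapper<Text, BytesWritable, Text, IntWritable> {
        @Override
        protected void map(Text key, BytesWritable value, Context context)
                throws IOException, InterruptedException {
            context.write(key, new IntWritable(imageSize(value)));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "image-sizes");
        job.setJarByClass(ImageSizeJob.class);

        // Read the <Text, BytesWritable> records written earlier.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/tmp/out.seq"));

        job.setMapperClass(ImageSizeMapper.class);
        job.setNumReduceTasks(0); // map-only: no reducer needed for this sketch
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        // Assumed output location; it must not exist before the job runs.
        FileOutputFormat.setOutputPath(job, new Path("/tmp/image-sizes"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```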