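Hi, I'm a beginner with MapReduce, and I want to program WordCount so that it outputs K/V pairs. The problem is that I don't want a tab character as the key/value separator in the output files. How can I change it?
The code I'm using is slightly different from the example code. Here is the driver class.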
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Job1");
job.setJarByClass(Simpletask.class);
job.setMapperClass(TokenizerMapper.class);
//job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
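Because I want the output file names to correspond to the reducer's partitions, I call MultipleOutputs.write() (mos.write() below) in the reduce function, which is why the code is slightly different.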
public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
    String accu = "";
    for (Text val : values) {
        // Each value is expected to look like "<id>,<MBR>"
        String[] entry = val.toString().split(",");
        String MBR = entry[1];
        // Assume the MBR is entry 1. It can be replaced by invoking a function that calculates MBR([coordinates])
        String mes_line = entry[0] + ",MBR" + MBR + " ";
        result.set(mes_line);
        // Write to a file named after the key, via MultipleOutputs
        mos.write(key, result, generateFileName(key));
    }
}
Any help would be greatly appreciated! Thanks!
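
For what it's worth, here is a minimal sketch of the mechanism as I understand it, not a verified fix. With the new API (org.apache.hadoop.mapreduce), TextOutputFormat appears to read its separator from the property mapreduce.output.textoutputformat.separator and fall back to a tab when it is unset (older releases used mapred.textoutputformat.separator). The class below does not use Hadoop at all: SeparatorSketch, formatLine, and the Map standing in for Configuration are hypothetical names made up for illustration, so it can be run on its own.

```java
import java.util.HashMap;
import java.util.Map;

class SeparatorSketch {
    // Property key read by the new-API TextOutputFormat (assumption: Hadoop 2.x or later).
    static final String SEPARATOR_KEY = "mapreduce.output.textoutputformat.separator";

    // Hypothetical stand-in for the line writer: key + separator + value,
    // defaulting to a tab when the property is not set.
    static String formatLine(Map<String, String> conf, Object key, Object value) {
        String separator = conf.getOrDefault(SEPARATOR_KEY, "\t");
        return key + separator + value;
    }

    public static void main(String[] args) {
        // Plain Map used here in place of Hadoop's Configuration object.
        Map<String, String> conf = new HashMap<>();

        // No property set: the key and value are joined with a tab.
        System.out.println(formatLine(conf, 3, "a,MBR7 "));

        // Property set: the key and value are joined with the chosen separator.
        conf.put(SEPARATOR_KEY, ";");
        System.out.println(formatLine(conf, 3, "a,MBR7 "));

        // In the real driver the corresponding call would presumably be
        //   conf.set("mapreduce.output.textoutputformat.separator", ";");
        // placed before Job.getInstance(conf, "Job1"), because the Job copies
        // the Configuration when it is created.
    }
}
```

If that assumption holds, setting the property in the driver above before the job is created (or through job.getConfiguration().set(...) afterwards) should also affect the files written through mos.write(), since those go through the same TextOutputFormat. I have not tested that part.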