我试图运行 WordCount 示例的变体,变体是,Mapper 输出 Text 作为键,Text 作为值,reducer 输出 Text 作为键,NullWritable 作为值。
除了 map,reduce 签名,我把 main 方法是这样的:
//start a conf
Configuration conf = new Configuration();
conf.set("str",str);
//initialize a job based on the conf
Job job = new Job(conf, "wordcount");
job.setJarByClass(org.myorg.WordCount.class);
//the reduce output
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
//the map output
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
//Map and Reduce
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
//take hdfs locations as input and output
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
//run the job
job.waitForCompletion(true);
为了调试,我把 map 函数作为
map(LongWritable key, Text value, Context context){
.........
context.write("1000000","2");
}
并将代码减少为
reduce(Text key, Iterable<Text> values, Context context){
.......
context.write("v",NullWritable.get());
}
但是,我在输出中看到的只是地图输出。减速器编译,但甚至没有被调用!我相信我可能在 main() 方法中遗漏了一些东西,它的代码已被描述,但还剩下什么?我看不到作业配置还需要什么信息。
谢谢,