I tried to implement the word count example myself. Here is my mapper implementation:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        Text word = new Text();
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, new IntWritable(1));
        }
    }
}
And the reducer:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        while (values.hasNext())
            sum += values.next().get();
        context.write(key, new IntWritable(sum));
    }
}
But when I run this code, the output looks like just the mapper's output. For example, if the input is "hello world hello", the output is
hello 1
hello 1
world 1
I also use a combiner between the map and reduce phases. Can anyone explain what is wrong with this code?
Thanks a lot!
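
One likely cause, offered as a sketch rather than a confirmed diagnosis: in the new API (org.apache.hadoop.mapreduce), Reducer.reduce takes an Iterable<VALUEIN>, not an Iterator. A reduce method declared with Iterator is a separate overload, so the framework keeps calling the inherited default, which writes every value through unchanged. The plain-Java example below reproduces that effect without Hadoop. BaseReducer, BrokenReduce, FixedReduce, and runWith are hypothetical stand-ins invented for illustration, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class OverloadDemo {
    // Hypothetical stand-in for Hadoop's Reducer: the default reduce()
    // takes an Iterable and passes every value through unchanged.
    static class BaseReducer {
        final List<String> out = new ArrayList<>();

        protected void reduce(String key, Iterable<Integer> values) {
            for (int v : values) {
                out.add(key + " " + v);
            }
        }

        // The "framework" only ever calls the Iterable version.
        final void run(String key, List<Integer> values) {
            reduce(key, values);
        }
    }

    // Same mistake as in the question: Iterator instead of Iterable.
    // This declares a new overload, so run() never reaches it.
    static class BrokenReduce extends BaseReducer {
        protected void reduce(String key, Iterator<Integer> values) {
            int sum = 0;
            while (values.hasNext()) sum += values.next();
            out.add(key + " " + sum);
        }
    }

    // Matching the Iterable signature makes this a real override;
    // @Override turns any signature mismatch into a compile error.
    static class FixedReduce extends BaseReducer {
        @Override
        protected void reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (int v : values) sum += v;
            out.add(key + " " + sum);
        }
    }

    // Feeds the grouped values for "hello world hello" to a reducer.
    static List<String> runWith(BaseReducer r) {
        r.run("hello", Arrays.asList(1, 1));
        r.run("world", Arrays.asList(1));
        return r.out;
    }

    public static void main(String[] args) {
        System.out.println(runWith(new BrokenReduce())); // [hello 1, hello 1, world 1]
        System.out.println(runWith(new FixedReduce()));  // [hello 2, world 1]
    }
}
```

If this is indeed the cause, changing the parameter to Iterable<IntWritable> in the real Reduce class, iterating with a for-each loop, and adding @Override should produce the summed counts. A combiner that reuses the same class would be affected in the same way.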