
I'm having an issue getting started with my first map-reduce code on Hadoop. I copied the following code from "Hadoop: The Definitive Guide", but I'm not able to run it on my single-node Hadoop installation.

My Code snippet:

Main:

Job job = new Job(); 
job.setJarByClass(MaxTemperature.class);
job.setJobName("Max temperature");

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

System.exit(job.waitForCompletion(true) ? 0 : 1);

Mapper:

public void map(LongWritable key, Text value, Context context)

Reducer:

public void reduce(Text key, Iterable<IntWritable> values,
Context context)

The implementations of the map and reduce functions are also taken from the book. But when I try to execute this code, this is the error I get:

INFO mapred.JobClient: Task Id : attempt_201304021022_0016_m_000000_0, Status : FAILED
    java.lang.ClassCastException: interface javax.xml.soap.Text
    at java.lang.Class.asSubclass(Class.java:3027)
    at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:774)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Answers to similar questions in the past (Hadoop type mismatch in key from map expected value Text received value LongWritable) helped me figure out that the InputFormatClass should match the input to the map function. So I also tried adding job.setInputFormatClass(TextInputFormat.class); to my main method, but that did not solve the issue either. What could be the problem here?
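
For reference, this is roughly where that call sits in the driver above (a minimal sketch, assuming the new-API org.apache.hadoop.mapreduce.lib.input.TextInputFormat is the class being imported):

import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// ... inside main(), after creating the Job ...
Job job = new Job();
job.setJarByClass(MaxTemperature.class);
job.setJobName("Max temperature");

// Explicitly declare the input format; TextInputFormat feeds the map
// function (LongWritable byte offset, Text line) pairs.
job.setInputFormatClass(TextInputFormat.class);

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));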

Here is the implementation of the Mapper class:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final int MISSING = 9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {

    String line = value.toString();
    String year = line.substring(15, 19);

    int airTemperature;
    if (line.charAt(45) == '+') { // parseInt doesn't like leading plus signs
      airTemperature = Integer.parseInt(line.substring(46, 50));
    } else {
      airTemperature = Integer.parseInt(line.substring(45, 50));
    }
    String quality = line.substring(50, 51);
    if (airTemperature != MISSING && quality.matches("[01459]")) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }

}

2 Answers


Your IDE auto-imported the wrong class. Instead of import org.apache.hadoop.io.Text, you imported import javax.xml.soap.Text.

You can find an example of this kind of wrong import in this blog.
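
For illustration, a minimal sketch of how the top of the Mapper file should look once the bad auto-import is replaced (the driver and Reducer need the same org.apache.hadoop.io.Text import wherever Text appears):

// Wrong: pulled in by the IDE's auto-import, and what the stack trace points to
// import javax.xml.soap.Text;

// Right: Hadoop's Writable text type
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  // ... map() exactly as in the question ...
}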

answered 2014-09-23T06:30:45.120

It looks like you have imported the wrong Text class (javax.xml.soap.Text). You want org.apache.hadoop.io.Text.
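
As a quick sanity check, you can print the fully qualified name of the Text class your code actually resolves to, for example from the driver:

// Prints "org.apache.hadoop.io.Text" when the correct class is imported,
// and "javax.xml.soap.Text" if the bad auto-import is still in place.
System.out.println(Text.class.getName());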

answered 2013-04-04T14:57:50.897