
I have four classes: MapperOne, ReducerOne, MapperTwo, and ReducerTwo. I want to chain them: MapperOne --> ReducerOne --> an output file is generated, which becomes the input to MapperTwo --> ReducerTwo --> final output file.

My driver class code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class StockDriver {


    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        System.out.println(" Driver invoked------");
        Configuration config = new Configuration();
        config.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", " ");
        config.set("mapred.textoutputformat.separator", " --> ");
        
        String inputPath="In\\NYSE_daily_prices_Q_less.csv";
         
        String outpath = "C:\\Users\\Outputs\\run1";
        String outpath2 = "C:\\Users\\Outputs\\run2";
        
        Job job1 = new Job(config,"Stock Analysis: Creating key values");
        job1.setInputFormatClass(TextInputFormat.class);
        job1.setOutputFormatClass(TextOutputFormat.class);
        
        job1.setMapOutputKeyClass(Text.class);
        job1.setMapOutputValueClass(StockDetailsTuple.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(Text.class);
        
        job1.setMapperClass(StockMapperOne.class);
        job1.setReducerClass(StockReducerOne.class);
                
        FileInputFormat.setInputPaths(job1, new Path(inputPath));
        FileOutputFormat.setOutputPath(job1, new Path(outpath));
                
        //THE SECOND MAP_REDUCE TO DO CALCULATIONS
            
        
        Job job2 = new Job(config,"Stock Analysis: Calculating Covariance");
        job2.setInputFormatClass(TextInputFormat.class);
        job2.setOutputFormatClass(TextOutputFormat.class);
        job2.setMapOutputKeyClass(LongWritable.class);
        job2.setMapOutputValueClass(Text.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        job2.setMapperClass(StockMapperTwo.class);
        job2.setReducerClass(StockReducerTwo.class);
            
        
        String outpath3=outpath+"\\part-r-00000";
        System.out.println("OUT PATH 3: " +outpath3 );
        FileInputFormat.setInputPaths(job2, new Path(outpath3));
        FileOutputFormat.setOutputPath(job2, new Path(outpath2));
        
        
        if (job1.waitForCompletion(true)) {
            System.out.println(job2.waitForCompletion(true));
        }
    }

}

My MapperOne and ReducerOne execute correctly, and the output file is stored at the right path. Now, when the second job runs, only the reducer is invoked. Below is the code of my MapperTwo and ReducerTwo.

MapperTwo

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StockMapperTwo extends Mapper<Text, Text, LongWritable, Text> {

    public void map(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException{
        System.out.println("------ MAPPER 2 CALLED-----");
        
        for(Text val: values){
            System.out.println("KEY: "+ key.toString() + "   VALUE: "+ val.toString());
            //context.write(new Text("mapper2"), new Text("hi"));
            context.write(new LongWritable(2), new Text("hi"));
        }
        
    }
}

ReducerTwo

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class StockReducerTwo extends Reducer<LongWritable, Text, Text, Text> {

    public void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        
            System.out.println(" REDUCER 2 INVOKED");
        
            context.write(new Text("hello"), new Text("hi"));   
    }
}

My doubts about this configuration are:

  1. Why is the mapper skipped even though it is set via job2.setMapperClass(StockMapperTwo.class)?

  2. If I don't set job2.setMapOutputKeyClass(LongWritable.class); and job2.setMapOutputValueClass(Text.class);, then even the reducer is not invoked, and this error appears:

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:870)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:573)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

What is going on here? I cannot get my mapper and reducer invoked properly.
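For context on doubt 1: job2 reads with TextInputFormat, which delivers (byte offset, whole line) records, while StockMapperTwo is declared as Mapper<Text, Text, ...>, which matches KeyValueTextInputFormat-style records split on a separator. A plain-Java sketch of the difference (no Hadoop required; the method names here are illustrative only, not Hadoop API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Plain-Java illustration (names made up, not Hadoop API) of what the two
// input formats hand a mapper for the same input line "AAPL 42.5".
public class InputFormatDemo {

    // TextInputFormat-style record: key = byte offset of the line, value = whole line
    static Map.Entry<Long, String> textInputRecord(long offset, String line) {
        return new SimpleEntry<>(offset, line);
    }

    // KeyValueTextInputFormat-style record: the line is split on the first separator
    static Map.Entry<String, String> keyValueRecord(String line, String sep) {
        int i = line.indexOf(sep);
        return i < 0 ? new SimpleEntry<>(line, "")
                     : new SimpleEntry<>(line.substring(0, i), line.substring(i + sep.length()));
    }

    public static void main(String[] args) {
        System.out.println(textInputRecord(0L, "AAPL 42.5"));   // 0=AAPL 42.5
        System.out.println(keyValueRecord("AAPL 42.5", " "));   // AAPL=42.5
    }
}
```

A mapper declared Mapper<Text, Text, ...> but fed TextInputFormat records is one way the expected map() signature can end up never lining up with what the framework calls.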


1 Answer


Sorry for posting this question. I had not noticed that my mapper was written wrongly.

Instead of this:

public void map(LongWritable key,Text values, Context context) throws IOException, InterruptedException{

I had kept it like this:

public void map(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException{

It took me a long time to spot this mistake, and I am not sure why no proper error was raised to point it out: because the parameter types did not match, my method never overrode Mapper.map, so the default identity implementation ran silently instead. Anyway, it is solved now.
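This failure mode is general Java behavior, not anything Hadoop-specific: a method whose parameter types differ from the superclass method is an overload, not an override, so the framework keeps calling the base-class implementation (here Mapper's identity map, which forwards the LongWritable key unchanged — hence the type-mismatch IOException above). A minimal plain-Java sketch (class names made up for illustration):

```java
// Demonstrates why a mis-typed "override" is silently ignored: the subclass
// method with different parameter types is an overload, so the base version runs.
class Base {
    public String map(long key, String value) { return "identity:" + value; }
}

class WrongSignature extends Base {
    // Parameter type differs (Iterable<String> vs String): this is an overload,
    // NOT an override, so Base.map still runs when called through a Base reference.
    public String map(long key, Iterable<String> values) { return "custom"; }
}

class CorrectSignature extends Base {
    @Override  // the compiler verifies this really overrides Base.map
    public String map(long key, String value) { return "custom:" + value; }
}

public class OverrideDemo {
    public static void main(String[] args) {
        Base wrong = new WrongSignature();
        Base right = new CorrectSignature();
        System.out.println(wrong.map(1L, "x"));  // identity:x  (base implementation ran)
        System.out.println(right.map(1L, "x"));  // custom:x
    }
}
```

Annotating the intended override with @Override (as in CorrectSignature) turns this silent mismatch into a compile-time error, which is exactly the "proper error" that was missing here.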

Answered 2013-07-16T11:10:58.203