I am trying to split a string in a MapReduce2 (YARN) job on the Hortonworks Sandbox. Accessing val[1] throws an ArrayIndexOutOfBoundsException; the job works fine when I do not split the input.

Mapper:

public class MapperClass extends Mapper<Object, Text, Text, Text> {

    private Text airline_id;
    private Text name;
    private Text country;
    private Text value1;

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {

        String s = value.toString();
        if (s.length() > 1) {

            String val[] = s.split(",");
            context.write(new Text("blah"), new Text(val[1]));
        }


    }
}

Reducer:

public class ReducerClass extends Reducer<Text, Text, Text, Text> {

    private Text result = new Text();

    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        String airports = "";

        if (key.equals("India")) {
            for (Text val : values) {
                airports += "\t" + val.toString();
            }
            result.set(airports);
            context.write(key, result);
        }
    }
}

Main class:

public class MainClass {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();
        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "Flights MR");

        job.setJarByClass(MainClass.class);
        job.setMapperClass(MapperClass.class);
        job.setReducerClass(ReducerClass.class);

        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Can you help me?

Update:

I found that it does not convert the Text to a String.

1 Answer

If the string you are splitting does not contain a comma, the resulting String[] has length 1, with the entire string in val[0].
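
The behaviour described above can be checked outside Hadoop with a minimal sketch. The class name `SplitDemo`, the helper `fieldCount`, and the sample lines are hypothetical and are not taken from the asker's job or input file:

```java
import java.util.Arrays;

class SplitDemo {
    // Returns how many fields String.split(",") produces for the given line.
    static int fieldCount(String line) {
        return line.split(",").length;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from the asker's input file.
        String[] samples = {"1,Air India,India", "no-commas-here"};
        for (String line : samples) {
            // A line without a comma comes back as a one-element array,
            // so index 1 does not exist for it.
            System.out.println(line + " -> " + Arrays.toString(line.split(",")));
        }
    }
}
```

For a line without a comma, `fieldCount` returns 1, so reading index 1 of the split result would throw.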

Currently, you only check that the string is longer than one character:

if (s.length() > 1)

But you never check that the split actually produced an array with more than one element; the code simply assumes that a comma was present:

context.write(new Text("blah"), new Text(val[1]));

If there was no split, this causes an out-of-bounds error. A possible solution is to check that the string contains at least one comma, instead of checking its length, like so:

String s = value.toString();
if (s.indexOf(',') > -1) {

    String val[] = s.split(",");
    context.write(new Text("blah"), new Text(val[1]));
}
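
One edge case may still slip through the comma check: `String.split` drops trailing empty strings, so a line such as "a," contains a comma but still splits into a single element. A slightly more defensive sketch is to test the length of the resulting array instead. The class `SafeSplit` and the helper `secondField` are hypothetical names used only for illustration:

```java
class SafeSplit {
    // Returns the second comma-separated field, or null when the split
    // produced fewer than two elements.
    static String secondField(String line) {
        String[] parts = line.split(",");
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, including the trailing-comma case.
        String[] samples = {"1,Air India,India", "a,", ",", "no-commas-here"};
        for (String line : samples) {
            System.out.println("\"" + line + "\" -> " + secondField(line));
        }
    }
}
```

In the mapper, the equivalent guard would be `if (val.length > 1)` placed after the split and before the call to `context.write`.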
answered 2017-02-03T23:53:37.290