Input:
a,b,c,d,e
q,w,34,r,e
1,2,3,4,e
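In the mapper, I take the value of the last field of each line, and I want to emit (e,(a,b,c,d)); that is, it should emit (key, (the remaining fields of that line)).
Any help is appreciated.
Current code: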
public static class Map extends Mapper<LongWritable, Text, Text, Text> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString(); // reads the input line by line
        String[] attr = line.split(","); // extract each attribute value from the CSV record
        context.write(attr[argno-1],line); // gives an error; it seems to accept only integers? How do I override this?
    }
}

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // further processing: loads the chunk into a 2D ArrayList object
    }
}

public static void main(String[] args) throws Exception {
    String line;
    String arguements[];
    Configuration conf = new Configuration();
    // compute the total number of attributes in the file
    FileReader infile = new FileReader(args[0]);
    BufferedReader bufread = new BufferedReader(infile);
    line = bufread.readLine();
    arguements = line.split(","); // split the fields separated by commas
    conf.setInt("argno", arguements.length); // save the attribute count in the job configuration
    Job job = new Job(conf, "nb");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setMapperClass(Map.class); /* The method setMapperClass(Class<? extends Mapper>) in the type Job is not applicable for the arguments (Class<Map>) */
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
Please note the errors I am getting (see the comments).
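
To make the intended output concrete, here is a minimal sketch in plain Java (no Hadoop types) of the split described above: the last comma-separated field becomes the key and everything before it becomes the value. The class name LastFieldSplitter and the method splitOnLastField are hypothetical names used only for this illustration. Since the mapper is declared as Mapper<LongWritable, Text, Text, Text>, the two strings would presumably have to be wrapped in Text objects before being passed to context.write.

```java
public class LastFieldSplitter {

    /**
     * Splits a comma-separated line into two parts:
     * index 0 holds the last field (the key),
     * index 1 holds the remaining fields, still comma-separated (the value).
     * Returns null when the line contains no comma, because there is
     * nothing to pair the key with in that case.
     */
    public static String[] splitOnLastField(String line) {
        int idx = line.lastIndexOf(',');
        if (idx < 0) {
            return null;
        }
        String key = line.substring(idx + 1);   // text after the last comma
        String rest = line.substring(0, idx);   // text before the last comma
        return new String[] { key, rest };
    }

    public static void main(String[] args) {
        // The three sample records from the question.
        String[] input = { "a,b,c,d,e", "q,w,34,r,e", "1,2,3,4,e" };
        for (String line : input) {
            String[] kv = splitOnLastField(line);
            // In a Hadoop mapper this is where one would call something like
            // context.write(new Text(kv[0]), new Text(kv[1]));
            System.out.println("(" + kv[0] + ",(" + kv[1] + "))");
        }
    }
}
```

Using lastIndexOf instead of split(",") avoids needing the attribute count (argno) inside the mapper at all, which is a design choice of this sketch rather than something the original code does.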