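Is there a way to pass a system parameter (something like -Dmy_param=XXX) to the map function in the Hadoop MapReduce framework? The job is submitted to the Hadoop cluster via .setJarByClass(). In the mapper I have to create a configuration, and I would like it to be configurable, so I thought the standard approach of a properties file would be fine. I am just struggling with passing the parameter that says where the properties are located. Another option would be to add the properties file to the submitted jar. Does anyone have experience solving this?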
1 Answer
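If you are not already using them in your job, you can try GenericOptionsParser, Tool, and ToolRunner to run your Hadoop job.
Note: MyDriver extends Configured and implements Tool. To run your job, use this (generic options such as -D must come before the job's own arguments):
hadoop jar somename.jar MyDriver -D your.property=value arg1 arg2
For more information, check this link.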
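To make the command line above more concrete, here is a minimal, stdlib-only sketch of the idea behind it: `-D key=value` pairs are moved into a configuration, and only the remaining arguments (`arg1 arg2`) reach the job's own code. This is an illustration, not Hadoop's actual GenericOptionsParser; the class name `DashDSketch`, the `parse` method, and the plain `Map` standing in for Hadoop's `Configuration` are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT Hadoop's GenericOptionsParser.
final class DashDSketch {

    /**
     * Copies every "-D key=value" pair found in args into conf and
     * returns the arguments that are left over, in their original order.
     */
    static String[] parse(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            boolean isDefine = args[i].equals("-D") && i + 1 < args.length;
            if (!isDefine) {
                remaining.add(args[i]); // ordinary argument, passed through untouched
                continue;
            }
            String pair = args[++i]; // the token right after -D
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value after -D, got: " + pair);
            }
            conf.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return remaining.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>(); // stands in for Configuration
        String[] rest = parse(
                new String[] {"-D", "your.property=value", "arg1", "arg2"}, conf);
        System.out.println("conf = " + conf);
        System.out.println("remaining args = " + Arrays.toString(rest));
    }
}
```

With ToolRunner, this split is done for you: the properties end up in the Configuration returned by getConf(), and run(String[] args) receives only the leftover arguments.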
Here is some sample code I have prepared for you:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

    public static class MyDriverMapper extends Mapper<LongWritable, Text, LongWritable, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // In the mapper you can retrieve any configuration you set
            // with -D when starting the job from the terminal
            Configuration conf = context.getConfiguration();
            String yourPropertyValue = conf.get("your.property");
        }
    }

    public static class MyDriverReducer extends Reducer<LongWritable, NullWritable, LongWritable, NullWritable> {
        @Override
        protected void reduce(LongWritable key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            // --- some code ---
        }
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic options (such as -D) before calling run()
        int exitCode = ToolRunner.run(new MyDriver(), args);
        System.exit(exitCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // If you want, you can get/set values on conf here too.
        // your.property can also be a file location: load the properties
        // from that file and set them one by one on the conf object.
        // --- other code ---
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor
        Job job = Job.getInstance(conf, "My Sample Job");
        // --- other code ---
        return job.waitForCompletion(true) ? 0 : 1;
    }
}
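The comment in run() mentions that your.property can also point to a properties file whose entries are copied into the configuration one by one. A minimal sketch of that loop, using only the Java standard library, might look like the following. The class name `PropertiesToConfSketch`, the `load` method, and the sample keys are assumptions for illustration, and a plain `Map` stands in for Hadoop's `Configuration`; in a real driver the loop body would call `conf.set(name, value)` on the object returned by getConf().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper for illustration; a Map stands in for Hadoop's Configuration.
final class PropertiesToConfSketch {

    /** Reads key=value lines from source and copies each entry into a new map. */
    static Map<String, String> load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> conf = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            // In a real driver: conf.set(name, props.getProperty(name));
            conf.put(name, props.getProperty(name));
        }
        return conf;
    }

    public static void main(String[] args) {
        // A StringReader keeps the sketch self-contained; a real job would open
        // the file whose location was passed in, for example via -D your.property=...
        Reader source = new StringReader("my.param=XXX\nother.param=42\n");
        System.out.println(load(source));
    }
}
```

Because the copy happens in the driver before the job is created, the values travel with the job configuration and can be read in the mapper through context.getConfiguration(), so the properties file itself does not need to be bundled into the submitted jar.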
answered 2013-07-20T08:47:37.660