
I've seen many ways to give the user classpath precedence over the Hadoop one. This is often done when an M/R job needs a specific version of a library for which Hadoop already bundles an older version (for example Jackson's JSON parser, Commons HTTP, etc.).

In any case, I've seen:

mapreduce.task.classpath.user.precedence
mapreduce.task.classpath.first
mapreduce.job.user.classpath.first

Which of these parameters is the right one to set in my job configuration to force mappers and reducers to use a classpath that puts my user-defined hadoop_classpath jars BEFORE the default Hadoop dependency jars?

By the way, this is related to this question: "Dynamodb requestHandler acception", which I recently found is caused by a jar conflict.


5 Answers


So, assuming you are using 0.20.203, this is handled in the TaskRunner.java code as follows:

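  • The property you are looking for is on line 94: mapreduce.user.classpath.first
  • Line 214 is where the classpath list is built; it delegates to a method named getClassPaths(..)
  • getClassPaths() is defined at line 524, and there you should see that the configuration property is used to decide whether your job + dist cache libraries or the Hadoop libraries go onto the classpath first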

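For other versions of Hadoop, you are best off checking the TaskRunner.java class to confirm the name of the configuration property; after all, this is a "semi-hidden config":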

static final String MAPREDUCE_USER_CLASSPATH_FIRST =
        "mapreduce.user.classpath.first"; //a semi-hidden config
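The ordering decision described for getClassPaths() can be pictured with a small standalone sketch. This is not the TaskRunner source: the class name, method name, and jar paths below are hypothetical, and the code only models the idea that one boolean flag decides which group of jars is searched first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the ordering decision; NOT the actual
// TaskRunner.getClassPaths() implementation.
class ClasspathOrderSketch {

    // Returns classpath entries in the order the task JVM would search them.
    static List<String> buildClasspath(boolean userClasspathFirst,
                                       List<String> userJars,
                                       List<String> hadoopJars) {
        List<String> classpath = new ArrayList<>();
        if (userClasspathFirst) {
            // User job + dist cache jars win any class name conflict.
            classpath.addAll(userJars);
            classpath.addAll(hadoopJars);
        } else {
            // Default: Hadoop's bundled jars shadow the user's versions.
            classpath.addAll(hadoopJars);
            classpath.addAll(userJars);
        }
        return classpath;
    }

    public static void main(String[] args) {
        // Illustrative jar names only.
        List<String> userJars = Arrays.asList("job/lib/jackson-new.jar");
        List<String> hadoopJars = Arrays.asList("hadoop/lib/jackson-old.jar");
        System.out.println(String.join(":", buildClasspath(true, userJars, hadoopJars)));
        System.out.println(String.join(":", buildClasspath(false, userJars, hadoopJars)));
    }
}
```

Because the JVM loads the first matching class it finds along the classpath, whichever group is added first is the one whose library version the mapper or reducer actually sees.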
answered 2012-07-28T00:33:31.687

As of the latest Hadoop versions (2.2+), you should set:

    conf.setBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, true);
answered 2014-05-19T10:25:12.247

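These settings only work when the external jar's classes are referenced from within your mapper or reducer tasks. If you use them in a custom InputFormat, however, the class will fail to load. One way to make sure this works everywhere (in MR2) is to export this setting when submitting the job: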

export HADOOP_USER_CLASSPATH_FIRST=true
answered 2014-11-05T20:31:34.230

I had the same problem, and the parameter that worked for me on Hadoop version 0.20.2-cdhu03 was "mapreduce.task.classpath.user.precedence".

This setting was tested and does not work on CDH3U3; the following answer comes from the Cloudera team:

// JobConf job = new JobConf(getConf(), MyJob.class);
// job.setUserClassesTakesPrecedence(true);

http://archive.cloudera.com/cdh/3/hadoop/api/org/apache/hadoop/mapred/JobConf.html#setUserClassesTakesPrecedence%28boolean%29

answered 2012-08-24T16:13:28.653

In the MapR distribution, the property is "mapreduce.task.classpath.user.precedence":
http://www.mapr.com/doc/display/MapR/mapred-site.xml

<property>
    <name>mapreduce.task.classpath.user.precedence</name>
    <value>true</value>
    <description>Set to true if user wants to set different classpath. (AVRO) </description>
</property>

jobConf.setUserClassesTakesPrecedence(true);
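Since the property name differs between versions and distributions, one defensive option is to collect every name mentioned in this thread and set them all. The sketch below is only an illustration of that idea: the class and method names are hypothetical, it uses a plain Map rather than a Hadoop class so that it runs without Hadoop on the classpath, and the version notes in the comments simply repeat what the answers here report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the keys are the property names quoted in this thread.
class UserClasspathFirstKeys {

    // Returns each known "user classpath first" key mapped to "true".
    static Map<String, String> userFirstSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        // Reported for 0.20.203 (TaskRunner.java).
        settings.put("mapreduce.user.classpath.first", "true");
        // Reported for Hadoop 2.2+ (MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST).
        settings.put("mapreduce.job.user.classpath.first", "true");
        // Reported for some CDH3 builds and for MapR.
        settings.put("mapreduce.task.classpath.user.precedence", "true");
        return settings;
    }

    public static void main(String[] args) {
        // Print the entries in mapred-site.xml-like key=value form.
        for (Map.Entry<String, String> e : userFirstSettings().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In a real job, each entry could be copied into the job configuration with conf.set(key, value); keys that a given Hadoop version does not read should simply have no effect, though that is worth confirming against the TaskRunner or MRJobConfig source for your version.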

answered 2013-06-26T07:58:50.713