java - How to use external jars in Cloudera hadoop?

Question

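I have cloudera hadoop version 4 installed on my cluster. It comes packaged with the google protobuffer jar, version 2.4. In my application code I use protobuffer classes compiled with protobuffer version 2.5.

This causes unresolved compilation problems at run time. Is there a way to run the map reduce jobs with an external jar, or am I stuck until cloudera upgrades their service?

Thanks.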

score 2 · Accepted Answer

Yes, you can run MR jobs with external jars.

Be sure to add every dependency to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples.

You can use the following to add all the jar dependencies from the current directory and the lib directory to HADOOP_CLASSPATH:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar | sed 's/ /:/g'`:`echo lib/*.jar | sed 's/ /:/g'`

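Bear in mind that when starting a job through hadoop jar you also need to pass the jars of any dependencies with -libjars. Because -libjars is a generic option, it only takes effect if your driver class parses generic options (for example, by running through ToolRunner). I like to use: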

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while the -libjars list must be comma-separated (,).
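
As a possible alternative to the echo/sed one-liners, here is a minimal sketch of a small shell function that builds either list. The name join_jars is hypothetical (it is not part of Hadoop), and the lib directory is just the layout assumed in this answer. Unlike echo piped through sed, it does not break on jar names that contain spaces and it prints nothing when the directory has no jars.

```shell
# join_jars DELIM DIR
# Print every *.jar file in DIR, joined by DELIM.
# Hypothetical helper for illustration; not part of Hadoop.
join_jars() {
  local delim="$1" dir="$2" out="" jar
  for jar in "$dir"/*.jar; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$jar" ] || continue
    # Prepend the delimiter only when the list is already non-empty.
    out="${out:+$out$delim}$jar"
  done
  printf '%s\n' "$out"
}

# Intended usage (not executed here):
#   export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$(join_jars : lib)"
#   hadoop jar <jar> <class> -libjars "$(join_jars , lib)" [args...]
```

The same function serves both cases; only the delimiter argument changes.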

EDIT: If you need your classpath to be interpreted first to ensure your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:

export HADOOP_USER_CLASSPATH_FIRST=true
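
One way to sanity-check the ordering is to look at which matching entry comes first on the client classpath. The sketch below assumes that the hadoop classpath command on your installation reflects the HADOOP_CLASSPATH ordering; the function name first_match is hypothetical. Entries that Hadoop lists as wildcards (such as a directory followed by /*) are not expanded, so a bundled jar may not show up by name.

```shell
# first_match CLASSPATH PATTERN
# Print the first ':'-separated classpath entry that contains PATTERN.
# Hypothetical helper for illustration; not part of Hadoop.
first_match() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -F -- "$2" | head -n 1
}

# Intended usage (not executed here):
#   export HADOOP_USER_CLASSPATH_FIRST=true
#   first_match "$(hadoop classpath)" protobuf
```

If the entry printed is your own protobuf jar rather than the bundled one, the client side is picking up your version first.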
