
I'm quite confused about building and running my first job in Hadoop, and would appreciate help from anyone who can clarify the errors I'm seeing and offer some guidance :)

I have a compiled JAR file. When I try to run an M/R job with it on OSX, I get the SCDynamicStore error that is usually tied to the HADOOP_OPTS environment variable. However, this does not happen when I run an example from the examples JAR. I set the variable in hadoop-env.sh and it seems to be picked up by the cluster.

Running a test from hadoop-examples.jar works:

$ hadoop jar /usr/local/Cellar/hadoop/1.1.2/libexec/hadoop-examples-1.1.2.jar wordcount /stock/data /stock/count.out
13/06/22 13:21:51 INFO input.FileInputFormat: Total input paths to process : 3
13/06/22 13:21:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/06/22 13:21:51 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/22 13:21:51 INFO mapred.JobClient: Running job: job_201306221315_0003
13/06/22 13:21:52 INFO mapred.JobClient:  map 0% reduce 0%
13/06/22 13:21:56 INFO mapred.JobClient:  map 66% reduce 0%
13/06/22 13:21:58 INFO mapred.JobClient:  map 100% reduce 0%
13/06/22 13:22:04 INFO mapred.JobClient:  map 100% reduce 33%
13/06/22 13:22:05 INFO mapred.JobClient:  map 100% reduce 100%
13/06/22 13:22:05 INFO mapred.JobClient: Job complete: job_201306221315_0003
...

Running a job with my own class does not work:

$ hadoop jar test.jar mapreduce.X /data /output
13/06/22 13:38:36 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/06/22 13:38:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/06/22 13:38:36 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/22 13:38:36 INFO mapred.FileInputFormat: Total input paths to process : 3
13/06/22 13:38:36 INFO mapred.JobClient: Running job: job_201306221328_0002
13/06/22 13:38:37 INFO mapred.JobClient:  map 0% reduce 0%
13/06/22 13:38:44 INFO mapred.JobClient: Task Id : attempt_201306221328_0002_m_000000_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.NoClassDefFoundError: com/google/gson/TypeAdapterFactory
    at mapreduce.VerifyMarket$Map.<clinit>(VerifyMarket.java:26)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:249)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:873)
    at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:947)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: com.google.gson.TypeAdapterFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 22 more

attempt_201306221328_0002_m_000000_0: 2013-06-22 13:38:39.314 java[60367:1203] Unable to load realm info from SCDynamicStore
13/06/22 13:38:44 INFO mapred.JobClient: Task Id : attempt_201306221328_0002_m_000001_0, Status : FAILED
... (This repeats a few times, but hopefully this is enough to see what I mean.)

Initially I thought this was related to the environment variable mentioned above, but now I'm not so sure. Maybe I packaged my JAR incorrectly? (See the quick check below.)
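One quick way to check the packaging, using the same test.jar as above, is to list the JAR's contents and see whether the gson classes are actually bundled inside it:

$ jar tf test.jar | grep gson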


1 Answer


The simplest answer is to convert the project to Maven and include a gson dependency in the POM. Then mvn package pulls in all the necessary dependencies and creates a JAR file containing everything the job needs to run on the cluster.
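A minimal sketch of the relevant pom.xml pieces. The gson and plugin versions are assumptions, and the answer does not say how mvn package bundles the dependency; the shade plugin shown here is one common way to build such a self-contained job JAR:

<!-- gson provides com.google.gson.TypeAdapterFactory, the class missing at runtime -->
<dependencies>
  <dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.2.4</version>
  </dependency>
  <!-- Hadoop classes are already present on the cluster, so they are not bundled -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.1.2</version>
    <scope>provided</scope>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- Bundles compile-scope dependencies (gson) into the job JAR during `mvn package`,
         so the task JVMs on the cluster can load them -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>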

Answered 2013-10-24T09:33:31.713