I am starting my Hadoop cluster with 4 slaves, and everything works except on one machine. I set up all of them in exactly the same way.

The error I get when running ./start-all.sh is:

xxxxx: starting tasktracker, logging to /xxxxx/xxxxx/hadoop/logs/hadoop-xxxxx-tasktracker-xxxxx.out
xxxxx: /xxxxx/xxxxx/hadoop/hadoop-0.20/bin/hadoop: line 413:  7012 Aborted                 nohup $_JAVA_EXEC -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@" >"$_HADOOP_DAEMON_OUT" 2>&1 </dev/null

Details from the log:

/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = xxxxx/144.99.120.153
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2-cdh3u5
STARTUP_MSG:   build = file:///data/1/xxxxx/topdir/BUILD/hadoop-0.20.2-cdh3u5 -r 580d1d26c7ad6a7c6ba72950d8605e2c6fbc96cc; compiled by 'root' on Mon Aug  6 20:20:46 PDT 2012
************************************************************/
02:29:24.395 [main] DEBUG org.apache.hadoop.security.Groups -  Creating new Groups object
02:29:24.513 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
02:29:24.529 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login
02:29:24.529 [main] DEBUG o.a.h.security.UserGroupInformation - hadoop login commit
02:29:24.531 [main] DEBUG o.a.h.security.UserGroupInformation - using local user:UnixPrincipal:xxxxx
02:29:24.533 [main] DEBUG o.a.h.security.UserGroupInformation - UGI loginUser:xxxxx (auth:SIMPLE)
02:29:24.604 [main] DEBUG org.apache.hadoop.fs.FileSystem - Creating filesystem for file:///
02:29:24.627 [main] WARN  org.apache.hadoop.util.DiskChecker - Incorrect permissions were set on /xxxxx/xxxxx/hadoop-hdfs-dev/dfs/data, expected: rwx------, while actual: rwxr-----. Fixing...
02:29:24.630 [main] DEBUG o.a.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x0000002a9555d827, pid=6924, tid=1076017504
#
# JRE version: 6.0_25-b06
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [ld-linux-x86-64.so.2+0x7827]  double+0xe7
#
# An error report file with more information is saved as:
# /xxxxx/xxxxx/xxxxx/hadoop/hadoop-0.20/hs_err_pid6924.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

Details from the third log (the hs_err crash report):

---------------  T H R E A D  ---------------

Current thread (0x0000000040122000):  JavaThread "main" [_thread_in_native, id=6926, stack(0x000000004012b000,0x000000004022c000)]

siginfo:si_signo=SIGFPE: si_errno=0, si_code=1 (FPE_INTDIV), si_addr=0x0000002a9555d827

Registers:
RAX=0x000000000f4d007f, RBX=0x0000002a9d2ab8b8, RCX=0x0000000040228700, RDX=0x0000000000000000
RSP=0x00000000402285e0, RBP=0x00000000402287b0, RSI=0x0000002a9b6c1080, RDI=0x0000002a9b2afe70
R8 =0x0000002a9d2ab8a8, R9 =0x0000000000000000, R10=0x0000000040228838, R11=0x0000000000000000
R12=0x0000002a9b6c1080, R13=0x0000002a9d0a633c, R14=0x0000000000000000, R15=0x0000000000000000
RIP=0x0000002a9555d827, EFLAGS=0x0000000000010246, CSGSFS=0x000000000000a940, ERR=0x0000000000000000
  TRAPNO=0x0000000000000000

Top of Stack: (sp=0x00000000402285e0)
0x00000000402285e0:   0000000040228740 0000002a955919c8
0x00000000402285f0:   0000000040228580 0000001095560de4
0x0000000040228600:   0000000000000000 000000009678c130
0x0000000040228610:   0000002a967f86b8 0000002a967f7f30
...............

Stack: [0x000000004012b000,0x000000004022c000],  sp=0x0000000040228870,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [ld-linux-x86-64.so.2+0x7827]  double+0xe7
C  [ld-linux-x86-64.so.2+0x8fc0]  _dl_relocate_object+0x410
C  [libc.so.6+0xf8558]  double+0x238
C  [ld-linux-x86-64.so.2+0xae00]  _dl_catch_error+0x60
C  [libdl.so.2+0x1054]  double+0x34

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+300
j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+347
j  java.lang.Runtime.loadLibrary0(Ljava/lang/Class;Ljava/lang/String;)V+54
j  java.lang.System.loadLibrary(Ljava/lang/String;)V+7
j  org.apache.hadoop.util.NativeCodeLoader.<clinit>()V+25
v  ~StubRoutines::call_stub
j  org.apache.hadoop.io.nativeio.NativeIO.<clinit>()V+17
v  ~StubRoutines::call_stub
j  org.apache.hadoop.io.ReadaheadPool.getInstance()Lorg/apache/hadoop/io/ReadaheadPool;+12
j  org.apache.hadoop.mapred.TaskTracker.<init>(Lorg/apache/hadoop/mapred/JobConf;)V+145
j  org.apache.hadoop.mapred.TaskTracker.main([Ljava/lang/String;)V+52
v  ~StubRoutines::call_stub

---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
  0x0000002a972d1000 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=7029, stack(0x000000004103a000,0x000000004113b000)]
  0x0000002a972cf000 JavaThread "C2 CompilerThread1" daemon [_thread_blocked, id=7028, stack(0x0000000040f39000,0x000000004103a000)]
  0x0000002a972c9800 JavaThread "C2 CompilerThread0" daemon [_thread_blocked, id=7027, stack(0x0000000040e38000,0x0000000040f39000)]
  0x0000002a972c7800 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=7026, stack(0x0000000040d37000,0x0000000040e38000)]
  0x0000002a972a6000 JavaThread "Finalizer" daemon [_thread_blocked, id=7025, stack(0x0000000040c36000,0x0000000040d37000)]
  0x0000002a972a4000 JavaThread "Reference Handler" daemon [_thread_blocked, id=7024, stack(0x0000000040b35000,0x0000000040c36000)]
=>0x0000000040122000 JavaThread "main" [_thread_in_native, id=7014, stack(0x000000004012b000,0x000000004022c000)]

Other Threads:
  0x0000002a9729d000 VMThread [stack: 0x0000000040a34000,0x0000000040b35000] [id=7023]
  0x0000002a9b226800 WatcherThread [stack: 0x000000004113b000,0x000000004123c000] [id=7030]

VM state:not at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread: None

Heap
 PSYoungGen      total 150016K, used 18028K [0x00000000eb2b0000, 0x00000000f5a10000, 0x0000000100000000)
  eden space 128640K, 14% used [0x00000000eb2b0000,0x00000000ec44b318,0x00000000f3050000)
  from space 21376K, 0% used [0x00000000f4530000,0x00000000f4530000,0x00000000f5a10000)
  to   space 21376K, 0% used [0x00000000f3050000,0x00000000f3050000,0x00000000f4530000)
 PSOldGen        total 342848K, used 0K [0x00000000c1800000, 0x00000000d66d0000, 0x00000000eb2b0000)
  object space 342848K, 0% used [0x00000000c1800000,0x00000000c1800000,0x00000000d66d0000)
 PSPermGen       total 21248K, used 9638K [0x00000000bc600000, 0x00000000bdac0000, 0x00000000c1800000)
  object space 21248K, 45% used [0x00000000bc600000,0x00000000bcf69910,0x00000000bdac0000)

Code Cache  [0x0000002a972e7000, 0x0000002a97557000, 0x0000002a9a2e7000)
 total_blobs=321 nmethods=82 adapters=193 free_code_cache=49720000 largest_free_block=21824

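I am not sure what is causing this. Can anyone help me? I created the slaves in exactly the same way, and the heap is set to 1 GB. Thank you very much!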

1 Answer

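You are seeing an integer overflow or integer divide-by-zero in native code (the SIGFPE was raised with si_code FPE_INTDIV). According to the stack trace, it happens inside the dynamic linker (ld-linux-x86-64.so.2, in _dl_relocate_object) while NativeCodeLoader is loading the native Hadoop library.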

One thread suggests that adding -XX:-UseCompressedOops to the JVM startup flags may help.
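
One way to try that flag, as a rough sketch: the launch command in the error output above passes $HADOOP_OPTS to the JVM, so appending the flag to that variable should get it onto the command line. Putting the line in conf/hadoop-env.sh on the failing slave is an assumption about your layout, and whether the flag avoids the crash is not guaranteed.

```shell
# Sketch only: append the flag to HADOOP_OPTS, which bin/hadoop passes to the JVM.
# Assumed location: conf/hadoop-env.sh on the slave that crashes.
export HADOOP_OPTS="${HADOOP_OPTS:-} -XX:-UseCompressedOops"

# Show the resulting value so it can be checked before restarting the daemons.
echo "HADOOP_OPTS=$HADOOP_OPTS"
```

After changing it, restart the daemons on that slave and check whether a new hs_err_pid*.log file is produced.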

Another thread indicates that your (Linux) kernel is too old: try upgrading to a newer version.
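
Before upgrading, it may be worth comparing the failing slave with a working one. The sketch below prints the kernel release; it also prints the glibc version, which is my own addition (not from the threads), on the assumption that it is relevant because the crashing frame is in ld-linux-x86-64.so.2. The variable names are illustrative.

```shell
# Sketch only: run on the failing slave and on a working slave, then compare.
node_name=$(uname -n)        # host name as reported by the kernel
kernel_release=$(uname -r)   # kernel release string
# getconf GNU_LIBC_VERSION is glibc-specific; fall back to "unknown" elsewhere.
glibc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null || echo "unknown")

echo "host:   $node_name"
echo "kernel: $kernel_release"
echo "glibc:  $glibc_version"
```

If the failing machine reports an older kernel or glibc than the others, that would be consistent with the suggestion above, although it does not prove it is the cause.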

Answered 2013-03-14T03:43:57.920