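I am running a Spark job on Dataproc. Everything in the code completes. The last line of my code is
print('savedddd'); print(scores)
and it executes as well.
All activity on all nodes drops to 0, but the Dataproc job does not end. My shell prints
22/01/13 19:29:15 INFO org.sparkproject.jetty.server.AbstractConnector: Stopped Spark@a69cfdd{HTTP/1.1, (http/1.1)}{0.0.0.0:0}
and that is it. The terminal stays stuck there. In the Jobs tab, the job keeps showing as running, and I have to cancel it manually.
Can you help me debug this issue?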
Following @Igor's instructions, I tried using jstack. Here is what I did.
I ran
sudo jps -mlvV
The relevant pid was 14961.
I then ran
sudo jstack -l 14961
Output:
Full thread dump OpenJDK 64-Bit Server VM (25.292-b10 mixed mode):
"DestroyJavaVM" #657 prio=5 os_prio=0 tid=0x00007f9bd8013800 nid=0x3a83 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-42-thread-1" #360 prio=5 os_prio=0 tid=0x00007f9b90582000 nid=0x3eb2 waiting on condition [0x00007f9b6b24b000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000094d52ac0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"yarn-scheduler-endpoint" #261 daemon prio=5 os_prio=0 tid=0x00007f9bd9d61000 nid=0x3dab waiting on condition [0x00007f9b6f185000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000088bf5dd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"client DomainSocketWatcher" #54 daemon prio=5 os_prio=0 tid=0x00007f9b919d1800 nid=0x3ae3 runnable [0x00007f9b84ee1000]
java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native Method)
at org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52)
at org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f9bd80cb800 nid=0x3a8c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f9bd80c7000 nid=0x3a8b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f9bd80c4000 nid=0x3a8a waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f9bd80c1000 nid=0x3a89 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f9bd808b800 nid=0x3a88 in Object.wait() [0x00007f9bc5e87000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
- locked <0x00000000881de630> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f9bd8087000 nid=0x3a87 in Object.wait() [0x00007f9bc5f88000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000881de800> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"VM Thread" os_prio=0 tid=0x00007f9bd807d800 nid=0x3a86 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f9bd8029000 nid=0x3a84 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f9bd802a800 nid=0x3a85 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f9bd80ce800 nid=0x3a8d waiting on condition
JNI global references: 7697
Heap
PSYoungGen total 542720K, used 445168K [0x00000000d8000000, 0x00000000ff300000, 0x0000000100000000)
eden space 439296K, 93% used [0x00000000d8000000,0x00000000f0f809d0,0x00000000f2d00000)
from space 103424K, 34% used [0x00000000f8e00000,0x00000000fb13b870,0x00000000ff300000)
to space 99328K, 0% used [0x00000000f2d00000,0x00000000f2d00000,0x00000000f8e00000)
ParOldGen total 542208K, used 248894K [0x0000000088000000, 0x00000000a9180000, 0x00000000d8000000)
object space 542208K, 45% used [0x0000000088000000,0x000000009730f8c8,0x00000000a9180000)
Metaspace used 145492K, capacity 160872K, committed 161152K, reserved 1189888K
class space used 19087K, capacity 20303K, committed 20352K, reserved 1048576K
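To make a dump like this easier to scan, here is a minimal sketch that lists the Java threads not marked as daemon. It relies on the fact that a JVM only exits once all non-daemon threads have finished. The function name non_daemon_threads, the regular expression, and the SAMPLE constant are my own illustrative choices, not part of jstack or Dataproc. SAMPLE reuses a few header lines from the output above.

```python
import re

# Java thread headers in jstack output look like:
#   "name" #<id> [daemon] prio=<n> ...
# JVM-internal threads (GC, VM Thread, ...) have no "#<id>" field and are skipped.
HEADER = re.compile(r'^"(?P<name>[^"]+)" #(?P<id>\d+) (?P<daemon>daemon )?prio=')

# A few header lines copied from the dump above (illustrative subset only).
SAMPLE = '''
"DestroyJavaVM" #657 prio=5 os_prio=0 tid=0x00007f9bd8013800 nid=0x3a83 waiting on condition [0x0000000000000000]
"pool-42-thread-1" #360 prio=5 os_prio=0 tid=0x00007f9b90582000 nid=0x3eb2 waiting on condition [0x00007f9b6b24b000]
"yarn-scheduler-endpoint" #261 daemon prio=5 os_prio=0 tid=0x00007f9bd9d61000 nid=0x3dab waiting on condition [0x00007f9b6f185000]
"VM Thread" os_prio=0 tid=0x00007f9bd807d800 nid=0x3a86 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f9bd8029000 nid=0x3a84 runnable
'''


def non_daemon_threads(dump_text):
    """Return the names of Java threads in a jstack dump that lack the daemon flag."""
    names = []
    for line in dump_text.splitlines():
        match = HEADER.match(line.strip())
        if match and not match.group("daemon"):
            names.append(match.group("name"))
    return names


if __name__ == "__main__":
    # In practice the text would come from a saved dump file instead of SAMPLE.
    for name in non_daemon_threads(SAMPLE):
        print(name)
```

On the sample lines this prints DestroyJavaVM and pool-42-thread-1. As far as I understand, DestroyJavaVM is the JVM's own thread that waits for the remaining non-daemon threads, which would leave pool-42-thread-1 as the candidate that keeps the process alive. I have not confirmed where that pool is created.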
2022-01-20 19:48:51
Full thread dump OpenJDK 64-Bit Server VM (25.292-b10 mixed mode):