
Problem Description

Hi everyone,

I was having problems remote debugging to a new server today and it initially appeared as if Eclipse was taking a long time to connect to a remote JVM. However, after some investigation I realized jdb was having a similar problem. Some digging turned up this...

Connecting to the remote JVM is not a problem. Both debuggers establish the socket connection properly within a few seconds. jdb even processes commands. However, after a remote debugger has connected successfully, the Sun Java 1.7.0_60-b19 JVM appears to hang (or transmit extremely slowly) when sending its thread debugging information (JVMTI/JDWP) across the network via TCP/IP.

Listing the remote JVM's threads appears to be the problem. JDB's threads command either hangs or executes extremely slowly. The load on the remote JVM is reasonable, and there aren't any breakpoints set. There are many threads executing on this JVM concurrently (~2005 threads), and there may be some WAN latency, but there still needs to be a way to successfully use a remote debugger with it!

Observation. I'm guessing this is related to inefficient transmission of thread information over TCP/IP by the JVM's Java Debug Wire Protocol (JDWP) implementation, as the machine is on the other side of the planet on a WAN. HOWEVER, Windows Remote Desktop Connection to the same machine is acceptably fast. Given that, having to wait 45 minutes for Eclipse or jdb to list thread information on a JVM seems unacceptable and is likely a bug (or a very poorly implemented feature).

Possibly related to?

  1. http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6401245 - This looks like it was fixed and shouldn't be the issue anymore. Plus the JVM is running on Windows Datacenter not Linux.
  2. http://www.eclipsecon.org/2013/comment/reply/1153.html - Possible solution but requires the SAP JVM?
  3. https://github.com/vpotapev/jbreakpoint - Open source jdb interface but doesn't fix issue with Eclipse.

Question. Does anyone have any ideas on how to make the thread data transmission more efficient so the JVM can be remote debugged effectively? Is this a bug in the JVM's Java Debug Wire Protocol (JDWP) implementation?

Problem Details

Java Version:

C:\Users\Administrator>"C:\Program Files\Java\jdk1.7.0_60\bin\java" -version
java version "1.7.0_60"
Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)

Application Server. Happens with WebLogic 10g. Also happens with IBM WebSphere 8.5.

Eclipse. Eclipse remote debugging hangs indefinitely when trying to populate the Debug view with thread information.

JDB - Remote. Java's jdb remote debugger hangs for upwards of 10 minutes when listing the threads with the threads command. Then it lists them VERY slowly (about 1 thread per second, so it would take roughly 33 minutes to list them all).

"C:\Program Files\Java\jdk1.7.0_60\bin\jdb.exe" -connect com.sun.jdi.SocketAttach:hostname=xxx.yyy.com,port=7777
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
> threads
Group system: <<jdb hangs here trying to get thread information>>
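
The 33-minute estimate is simple arithmetic, and the same arithmetic hints at why WAN latency could matter so much here. Below is a rough sketch in Java, assuming (this is a guess, not something measured in this post) that listing each thread costs a fixed number of synchronous request/reply round trips. The class name, method name, and the 4 round trips / 250 ms inputs are illustrative placeholders only.

```java
public class ThreadListingEstimate {

    /**
     * Estimated seconds to list all threads, assuming each thread costs a
     * fixed number of synchronous round trips (an assumption, not a
     * documented property of jdb or Eclipse).
     */
    static double estimateSeconds(int threadCount, int roundTripsPerThread, double rttMillis) {
        return threadCount * roundTripsPerThread * rttMillis / 1000.0;
    }

    public static void main(String[] args) {
        int threads = 2005; // thread count reported in the question

        // Observed rate: about one thread per second, modeled as 1 round trip of 1000 ms.
        double observed = estimateSeconds(threads, 1, 1000.0);
        System.out.printf("At 1 thread/s: %.1f minutes%n", observed / 60.0);

        // Hypothetical breakdown: 4 round trips per thread at 250 ms RTT
        // gives the same total, so per-thread chattiness multiplies latency.
        double hypothetical = estimateSeconds(threads, 4, 250.0);
        System.out.printf("At 4 x 250 ms per thread: %.1f minutes%n", hypothetical / 60.0);
    }
}
```

Under this model the total grows linearly with both thread count and round-trip time, which would be consistent with the command finishing in seconds locally and taking tens of minutes over a WAN.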

JDB - Local. The same jdb debugger executes the threads command in 3 seconds when run on the same machine as the JVM.

"C:\Program Files\Java\jdk1.7.0_60\bin\jdb.exe" -connect com.sun.jdi.SocketAttach:hostname=xxx.yyy.com,port=7777
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
> threads
Group system:
> threads
Group system:
  (java.lang.ref.Reference$ReferenceHandler)0x7484                                                                  Reference Handler
  (java.lang.ref.Finalizer$FinalizerThread)0x7485                                                                   Finalizer
  (java.lang.Thread)0x7486                                                                                          ...
GC Daemon
  (java.lang.Thread)0x748b                                                                                          RMI RenewClean-
...
>

1 Answer


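After some investigation and packet sniffing, it appears that a large number of threads in the remote process causes debug packet congestion and a processing-delay feedback loop between the remote process's debug agent and the remote debugger. Basically, the JVM debug agent was not designed for high-latency connections or for remotely debugging a large number of threads. It also makes no attempt to compress thread/stack/variable update information to reduce transmission time to the debugger.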

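Our solution ended up being to add a thread-debugging mode to our product that reduces the size of the various server thread pools. With that in place, debugging suddenly became acceptably responsive again over the overseas network.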
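
A minimal sketch of what such a mode might look like in Java, assuming the pools are plain `java.util.concurrent` executors. The class name, the `app.threadDebugMode` system property, and the debug pool size of 2 are all hypothetical; the actual product code is not shown in this answer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DebugAwarePools {

    // Hypothetical property name; start the JVM with -Dapp.threadDebugMode=true to enable.
    static final String DEBUG_PROP = "app.threadDebugMode";

    // Hypothetical cap on pool size while debugging.
    static final int DEBUG_POOL_SIZE = 2;

    /** Returns the pool size to use: capped in debug mode, unchanged otherwise. */
    static int poolSize(int normalSize, int debugSize, boolean debugMode) {
        return debugMode ? Math.min(normalSize, debugSize) : normalSize;
    }

    /** Creates a fixed pool whose size shrinks when the debug property is set to "true". */
    static ExecutorService newWorkerPool(int normalSize) {
        boolean debugMode = Boolean.getBoolean(DEBUG_PROP);
        return Executors.newFixedThreadPool(poolSize(normalSize, DEBUG_POOL_SIZE, debugMode));
    }

    public static void main(String[] args) {
        ExecutorService pool = newWorkerPool(200);
        System.out.println("Debug mode: " + Boolean.getBoolean(DEBUG_PROP));
        pool.shutdown();
    }
}
```

Fewer pool threads means fewer threads for the debugger to enumerate after each suspend, which is the effect described above. Application servers such as WebLogic or WebSphere manage their own pools, so there the equivalent change would be made in server configuration rather than in code like this.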

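A suggestion for the Java community going forward would be to add a thread filter feature to Eclipse and the Java debug agent, so that only information for the threads of interest is transmitted to the debugger after each code step is signaled.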
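
To make the filtering idea concrete, here is a small in-process illustration in Java. It is only an analogy: it selects threads by name prefix inside the running JVM using `Thread.getAllStackTraces()`, and it is not a change to the debug agent or to Eclipse. The class name, method name, and the `worker-` prefix are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadFilterSketch {

    /** Returns the names of live threads whose name starts with the given prefix. */
    static List<String> namesMatching(String prefix) {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix)) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Start one thread "of interest" so the filter has something to find.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Interrupted by main; just exit.
            }
        }, "worker-1");
        worker.setDaemon(true);
        worker.start();

        // Only threads matching the prefix are reported, not every JVM thread.
        System.out.println(namesMatching("worker-"));
        worker.interrupt();
    }
}
```

An agent-side filter along these lines would let the debugger receive data for a handful of threads instead of all of them, which is where the savings on a high-latency link would come from.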

Answered 2018-03-01T01:16:17.597