java - Pthread id from pthread_self() doesn't match data from dtrace script

Question

I am using the dtrace script from here to try to find out when the threads of a java program are context switched.

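I am trying to match the data collected by the script against trace data collected from the running program (method entry/exit and the like). I get the pthread id of the running thread with a short JNI method that simply returns the value of pthread_self().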

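The problem is that the thread ID I get from calling pthread_self() is completely different from any of the thread IDs I get in the dtrace script. I wonder whether this is because I am calling pthread_self() incorrectly, since it returns a pointer, but it is hard to find information about what pthread_t actually is on Mac OS X.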

score 3 · Accepted Answer

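So I'll answer my own question: the curthread and tid variables in dtrace are pointer values to the kernel thread structure. To get at these values so that dtrace data can be compared with user-space thread data, I had to create a kernel extension that retrieves these internal values for threads in user space.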

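In general this is a bad idea: it is not portable, it breaks easily if the kernel changes, and it may be a security risk. Unfortunately, I haven't found another way to achieve what I want.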

score 2 · Accepted Answer

From /usr/include/pthread.h:

typedef __darwin_pthread_t pthread_t;

Then from /usr/include/sys/_types.h:

struct _opaque_pthread_t {
  long __sig;
  struct __darwin_pthread_handler_rec* __cleanup_stack;
  char __opaque[__PTHREAD_SIZE__];
};
typedef struct _opaque_pthread_t* __darwin_pthread_t;

The source code is your friend :)

score 1 · Accepted Answer

How about something a bit more elegant using the pid provider, which deals with userland code?

# dtrace -n 'pid$target::pthread_self:return {printf("%p", arg1)}' -c 'java'
dtrace: description 'pid$target::pthread_self:return ' matched 1 probe
dtrace: pid 87631 has exited
CPU     ID                    FUNCTION:NAME
  0  90705              pthread_self:return 1053a7000
  0  90705              pthread_self:return 1054ad000
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  4  90705              pthread_self:return 10542a000
  4  90705              pthread_self:return 10542a000

Huzzah!

arg1 refers to the return value in the probe, which in this case is a pointer. If you need the stuff it points to, use copyin(arg1, size_of_struct) and cast the result to whatever you think it is (see @Nikolai's post and don't forget you can use #include in DTrace scripts as long as you remember the -C option on the command line). The pid$target provider name expands to pid1234, where 1234 is the process id of the command executed with the -c option - in this case, java.
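To line these probe values up with trace data logged by the program itself, the user-space side can print its own pthread_self() value in the same bare hexadecimal form. The following is only a minimal sketch: the names thread_handle_value and log_handle are hypothetical, and it assumes pthread_t is a pointer or integer type that fits in uintptr_t (true of the macOS typedef quoted above and of Linux, but not guaranteed by POSIX).

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Return the calling thread's pthread_t as an integer.
 * Assumption: pthread_t is a pointer or integer type (macOS, Linux). */
uintptr_t thread_handle_value(void) {
    return (uintptr_t)pthread_self();
}

/* Hypothetical thread entry point: logs the handle as bare hex digits,
 * the same form as the probe output above, and optionally stores it
 * through arg (a uintptr_t *, may be NULL). */
void *log_handle(void *arg) {
    uintptr_t *out = arg;
    uintptr_t handle = thread_handle_value();
    fprintf(stderr, "thread handle %" PRIxPTR "\n", handle);
    if (out != NULL) {
        *out = handle;
    }
    return NULL;
}
```

Passing log_handle to pthread_create prints one line per thread, and those lines can be compared with the values reported by the pthread_self:return probe. This only matches the user-level pthread_t; it says nothing about the kernel-side tid or curthread values.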

For more information, check out Brendan Gregg's blog (which is a great general source of dtrace info).

score 0 · Accepted Answer

On Linux, the most reliable way I have found to identify a process's context switches is the following command:

pidstat -hluwrt  | grep "processname"

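The 'tid' column (#3) is the same as 'gettid()', which lets developers directly correlate which thread is using the CPU and being context switched. I recommend printing the gettid() value when your program spawns a thread: printf("%d\n", (int)gettid()).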
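A minimal sketch of logging the thread ID at thread start is shown here. It is Linux-specific, the names current_tid and worker are hypothetical, and it calls syscall(SYS_gettid) directly on the assumption that the C library may not provide a gettid() wrapper (older glibc versions do not).

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread (Linux only). This is the
 * value to look for in pidstat's thread ID column. */
pid_t current_tid(void) {
    return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical thread entry point: logs its name and thread ID once,
 * before doing any real work. arg is a NUL-terminated name string. */
void *worker(void *arg) {
    const char *name = arg;
    fprintf(stderr, "thread %s started, tid=%d\n", name, (int)current_tid());
    /* ... the thread's real work would go here ... */
    return NULL;
}
```

Each thread created with worker as its start routine writes one line to stderr, which gives a name-to-tid mapping to read alongside the pidstat output.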

The last 2 columns before the process command line are the per-second context switch counts: 'cswch/s' (voluntary) and 'nvcswch/s' (involuntary).

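When 'cswch/s' is high (in the 1000s), your process is cycling excessively between 'wake' and 'sleep'. You may want to consider some kind of buffer to feed the thread, which allows longer wake and sleep periods. For example: while the buffer is not full, the thread stays asleep; once the buffer fills up, the thread stays awake until the buffer is empty.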
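The buffering idea can be sketched roughly as follows, assuming a single consumer thread fed integer items through a pthread mutex and condition variables. Every name here (batch_buffer, batch_put, batch_take_all, BATCH_CAPACITY) is hypothetical, and the capacity of 64 is an arbitrary placeholder.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define BATCH_CAPACITY 64 /* hypothetical size; tune for the workload */

typedef struct {
    int items[BATCH_CAPACITY];
    size_t count;
    int closed;
    pthread_mutex_t lock;
    pthread_cond_t ready;   /* signalled when the buffer is full or closed */
    pthread_cond_t drained; /* signalled after the consumer empties it */
} batch_buffer;

void batch_init(batch_buffer *b) {
    b->count = 0;
    b->closed = 0;
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->ready, NULL);
    pthread_cond_init(&b->drained, NULL);
}

/* Producer side: append one item. The consumer is woken only when the
 * buffer becomes full, not on every item. */
void batch_put(batch_buffer *b, int item) {
    pthread_mutex_lock(&b->lock);
    while (b->count == BATCH_CAPACITY) {
        pthread_cond_wait(&b->drained, &b->lock);
    }
    b->items[b->count++] = item;
    if (b->count == BATCH_CAPACITY) {
        pthread_cond_signal(&b->ready);
    }
    pthread_mutex_unlock(&b->lock);
}

/* Mark the end of input so the consumer can flush a partial batch. */
void batch_close(batch_buffer *b) {
    pthread_mutex_lock(&b->lock);
    b->closed = 1;
    pthread_cond_signal(&b->ready);
    pthread_mutex_unlock(&b->lock);
}

/* Consumer side: sleep until the buffer is full (or closed), then copy
 * out everything at once. out must hold BATCH_CAPACITY ints.
 * Returns the number of items copied; 0 means closed and empty. */
size_t batch_take_all(batch_buffer *b, int *out) {
    pthread_mutex_lock(&b->lock);
    while (b->count < BATCH_CAPACITY && !b->closed) {
        pthread_cond_wait(&b->ready, &b->lock);
    }
    size_t n = b->count;
    memcpy(out, b->items, n * sizeof b->items[0]);
    b->count = 0;
    pthread_cond_broadcast(&b->drained);
    pthread_mutex_unlock(&b->lock);
    return n;
}
```

With this shape the consumer blocks once per batch rather than once per item, which is what should reduce voluntary context switches. The trade-off is latency: an item waits until the buffer fills or batch_close is called, so a real implementation would probably also want a timed flush.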

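When 'nvcswch/s' is high (in the 1000s), it indicates that your system is heavily loaded and that threads are competing for CPU time. You may want to investigate the server load and the number of active processes and threads on the server: 'top' or 'htop' are your friends.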

I have found the following script useful for debugging/optimizing process threads (it outputs every 20 seconds):

stdbuf -oL pidstat -hluwrt  20 | stdbuf -oL grep -e "processname" -e "^#"

Documentation for gettid: (docs here)
Documentation for pidstat: (docs here)
Documentation for stdbuf: (docs here)
