
The OS is RHEL 7, and I am running a simple Go program:

package main

import (
    "time"
)

func main() {
    time.Sleep(1000 * time.Second)
}

While it is running, I check the number of threads in the process:

# cat /proc/13858/status | grep Thread
Threads:        5

When I use the pstack command that ships with RHEL, it prints the stack of only one thread:

# pstack 13858
Thread 1 (process 13858):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:307
#1  0x0000000000422580 in runtime.futexsleep (addr=0x4c7af8 <runtime.timers+24>, val=0, ns=999999997446) at /usr/local/go/src/runtime/os1_linux.go:57
#2  0x000000000040b07b in runtime.notetsleep_internal (n=0x4c7af8 <runtime.timers+24>, ns=999999997446, ~r2=255) at /usr/local/go/src/runtime/lock_futex.go:174
#3  0x000000000040b1e6 in runtime.notetsleepg (n=0x4c7af8 <runtime.timers+24>, ns=999999997446, ~r2=false) at /usr/local/go/src/runtime/lock_futex.go:206
#4  0x000000000043e5de in runtime.timerproc () at /usr/local/go/src/runtime/time.go:209
#5  0x0000000000451001 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1998
#6  0x0000000000000000 in ?? ()

Why does pstack print only one thread?

PS: here is the pstack script:

#!/bin/sh

if test $# -ne 1; then
    echo "Usage: `basename $0 .sh` <process-id>" 1>&2
    exit 1
fi

if test ! -r /proc/$1; then
    echo "Process $1 not found." 1>&2
    exit 1
fi

# GDB doesn't allow "thread apply all bt" when the process isn't
# threaded; need to peek at the process to determine if that or the
# simpler "bt" should be used.

backtrace="bt"
if test -d /proc/$1/task ; then
    # Newer kernel; has a task/ directory.
    if test `/bin/ls /proc/$1/task | /usr/bin/wc -l` -gt 1 2>/dev/null ; then
        backtrace="thread apply all bt"
    fi
elif test -f /proc/$1/maps ; then
    # Older kernel; go by it loading libpthread.
    if /bin/grep -e libpthread /proc/$1/maps > /dev/null 2>&1 ; then
        backtrace="thread apply all bt"
    fi
fi

GDB=${GDB:-/usr/bin/gdb}

# Run GDB, strip out unwanted noise.
# --readnever is no longer used since .gdb_index is now in use.
$GDB --quiet -nx $GDBARGS /proc/$1/exe $1 <<EOF 2>&1 |
set width 0
set height 0
set pagination no
$backtrace
EOF
/bin/sed -n \
    -e 's/^\((gdb) \)*//' \
    -e '/^#/p' \
    -e '/^Thread/p' 

2 Answers


pstack uses gdb. This is a quote from the Go documentation (https://golang.org/doc/gdb):

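GDB does not understand Go programs well. The stack management, threading, and runtime contain aspects that differ enough from the execution model GDB expects that they can confuse the debugger, even when the program is compiled with gccgo. As a consequence, although GDB can be useful in some situations, it is not a reliable debugger for Go programs, particularly heavily concurrent ones.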

Four of the five threads you see in /proc are created before the program enters main. I assume the Go runtime creates them.

Why does pstack print only one thread?

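From the strace output of gdb, I can see that gdb does try to attach to them, but after something goes wrong it makes no attempt to inspect those threads. These are the system calls gdb issues for the runtime threads before it decides, for an unknown reason, to stop investigating them immediately: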

5072  ptrace(PTRACE_ATTACH, 5023, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0
5072  ptrace(PTRACE_ATTACH, 5024, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0
5072  ptrace(PTRACE_ATTACH, 5025, 0, 0) = 0
5072  --- SIGCHLD (Child exited) @ 0 (0) ---
5072  rt_sigreturn(0x11)                = 0

However, you can inspect them yourself. These threads appear to belong to the Go runtime:

$ pstack 5094
Thread 1 (process 5094):
#0  0x0000000000459243 in runtime.futex ()
#1  0x00000000004271e0 in runtime.futexsleep ()
#2  0x000000000040d55b in runtime.notetsleep_internal ()
#3  0x000000000040d64b in runtime.notetsleep ()
#4  0x0000000000435677 in runtime.sysmon ()
#5  0x000000000042e6cc in runtime.mstart1 ()
#6  0x000000000042e5d2 in runtime.mstart ()
#7  0x00000000004592b7 in runtime.clone ()
#8  0x0000000000000000 in ?? ()

$ pstack 5095
Thread 1 (process 5095):
#0  0x0000000000459243 in runtime.futex ()
#1  0x0000000000427143 in runtime.futexsleep ()
#2  0x000000000040d3f4 in runtime.notesleep ()
#3  0x000000000042f6eb in runtime.stopm ()
#4  0x0000000000430a79 in runtime.findrunnable ()
#5  0x00000000004310ff in runtime.schedule ()
#6  0x000000000043139b in runtime.park_m ()
#7  0x0000000000455acb in runtime.mcall ()
#8  0x000000c820021500 in ?? ()
#9  0x0000000000000000 in ?? ()

$ pstack 5096
Thread 1 (process 5096):
#0  0x0000000000459243 in runtime.futex ()
#1  0x0000000000427143 in runtime.futexsleep ()
#2  0x000000000040d3f4 in runtime.notesleep ()
#3  0x000000000042f6eb in runtime.stopm ()
#4  0x000000000042fff7 in runtime.startlockedm ()
#5  0x0000000000431147 in runtime.schedule ()
#6  0x000000000043139b in runtime.park_m ()
#7  0x0000000000455acb in runtime.mcall ()
#8  0x000000c820020000 in ?? ()
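
The per-thread inspection above can be scripted. The following is a minimal sketch, not part of the original pstack script: it walks /proc/<pid>/task and runs a stack-dump command on each thread ID. The function name each_thread_stack and the PSTACK override variable are made up for illustration, and the sketch assumes a Linux /proc layout and that pstack accepts a thread ID, as in the runs shown above.

```shell
#!/bin/sh
# PSTACK is a hypothetical override, similar in spirit to the GDB variable
# honored by the pstack script; it defaults to the pstack command itself.
PSTACK=${PSTACK:-pstack}

# each_thread_stack PID
# Run $PSTACK once for every thread ID listed under /proc/PID/task.
each_thread_stack() {
    pid=$1
    if test ! -d "/proc/$pid/task"; then
        echo "Process $pid not found." 1>&2
        return 1
    fi
    for dir in "/proc/$pid/task"/*; do
        tid=${dir##*/}            # strip the directory part, keep the thread ID
        echo "== thread $tid =="
        $PSTACK "$tid"
    done
}
```

For example, each_thread_stack 13858 would call pstack once per thread of process 13858. Setting PSTACK=echo beforehand only lists the thread IDs, which is a cheap way to check what would be inspected.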

Update for gdb 8.0

pstack with gdb 8.0 correctly prints backtraces for all threads. The command looks like this:

$ GDB=$HOME/bin/gdb pstack  $(pidof main)

Here is its output (shortened):

$ GDB=$HOME/bin/gdb pstack  $(pidof main) | egrep "^Thread"
Thread 4 (LWP 18335):
Thread 3 (LWP 18334):
Thread 2 (LWP 18333):
Thread 1 (LWP 18332):
answered 2016-05-27T09:57:41.693

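When you pass an LWP/thread ID to pstack, you get only that thread's stack. Try passing the process's PID to pstack and you will get the stacks of all threads. You can get the process's PID, or Tgid (thread group ID), with cat /proc/13858/status | grep Tgid:. To list all the LWPs created by your process, you can run ps -L <PID>.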
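
As a rough sketch, the same two lookups can be done directly against /proc. This assumes Linux, and the helper names tgid_of and lwps_of are hypothetical, chosen only for this example:

```shell
#!/bin/sh
# tgid_of ID
# Print the thread group ID (the process PID) for a PID or thread ID,
# taken from the Tgid: field of /proc/ID/status.
tgid_of() {
    awk '/^Tgid:/ { print $2 }' "/proc/$1/status"
}

# lwps_of PID
# Print one thread ID (LWP) per line, read from /proc/PID/task.
lwps_of() {
    ls "/proc/$1/task"
}
```

With these, lwps_of "$(tgid_of 13858)" would list every thread of the process that 13858 belongs to, whether 13858 is the PID itself or one of its thread IDs.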

answered 2016-05-27T04:38:46.480