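Consider the following C++ program. I expect the first thread that calls exit to terminate the program. That is what happens when I build it with g++ -g test.cxx -lpthread. However, when I link against TCMalloc (g++ -g test.cxx -lpthread -ltcmalloc), it hangs. Why?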

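Inspecting the stack frames shows that the first thread to call exit is stuck in __unregister_atfork, waiting for some kind of reference-count variable to reach 0. Because that thread acquired the mutex beforehand and never releases it, all the other threads are blocked in pthread_mutex_lock. My guess is that there is some interaction between tcmalloc's atfork handlers and my code.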

Tested on CentOS 6.4 with gperftools 2.0.

$ cat test.cxx
#include <unistd.h>
#include <iostream>
#include <pthread.h>
#include <stdlib.h>

using namespace std;

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void* task(void*) {
    if (fork() == 0)
        return NULL;

    pthread_mutex_lock(&m);
    exit(0);
}

int main(int argc, char **argv) {
    cout << getpid() << endl;

    pthread_t t;
    for (unsigned i = 0; i < 100; ++i) {
        pthread_create(&t, NULL, task, NULL);
    }

    sleep(9999);
}

$ g++ -g test.cxx -lpthread && ./a.out
19515

$ g++ -g test.cxx -lpthread -ltcmalloc && ./a.out                             
24252
<<< process hangs indefinitely >>>
^C

$ pstack 24252
Thread 101 (Thread 0x7ffaabdf7700 (LWP 24253)):
#0  0x000000328c4f84c4 in __unregister_atfork () from /lib64/libc.so.6
#1  0x00007ffaac02d2c6 in __do_global_dtors_aux () from /usr/lib64/libtcmalloc.so.4
#2  0x0000000000000000 in ?? ()
Thread 100 (Thread 0x7ffaab3f6700 (LWP 24254)):
#0  0x000000328cc0e054 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x000000328cc09388 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x000000328cc09257 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400abf in task(void*) ()
#4  0x000000328cc07851 in start_thread () from /lib64/libpthread.so.0
#5  0x000000328c4e894d in clone () from /lib64/libc.so.6
<<< the other 98 threads are also deadlocked >>>
Thread 1 (Thread 0x7ffaabdf9740 (LWP 24252)):
#0  0x000000328c4acbcd in nanosleep () from /lib64/libc.so.6
#1  0x000000328c4aca40 in sleep () from /lib64/libc.so.6
#2  0x0000000000400b33 in main ()

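Edit: I think the problem may be that exit is not thread-safe. According to POSIX, exit is thread-safe. However, the glibc documentation states that exit is not thread-safe.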
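
One way to probe this, as a diagnostic sketch rather than a fix: swap exit for _exit, which terminates the process without running atexit handlers or static destructors, so the __do_global_dtors_aux frame seen in the trace is never entered. The names diag_task and run_diagnostic, the thread-count parameter, and the 30-second alarm are hypothetical additions for illustration only; whether this actually avoids the hang when linked with tcmalloc is untested.

```cpp
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t diag_m = PTHREAD_MUTEX_INITIALIZER;

// Same shape as task() in the question, but both paths use _exit().
static void* diag_task(void*) {
    if (fork() == 0)
        _exit(0);                // child: terminate at once, no destructors run

    pthread_mutex_lock(&diag_m); // never unlocked, so only one thread gets past here
    _exit(0);                    // parent: skips atexit handlers and static destructors
}

// Hypothetical wrapper: runs the experiment in a separate process and
// returns that process's exit status, or -1 if it did not exit normally.
int run_diagnostic(unsigned nthreads) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        alarm(30);               // safety net: SIGALRM kills this process if it hangs
        pthread_t t;
        for (unsigned i = 0; i < nthreads; ++i)
            pthread_create(&t, NULL, diag_task, NULL);
        sleep(9999);
        _exit(1);                // only reached if no thread called _exit first
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_diagnostic(100) mirrors the thread count in the question. A return value of 0 means some thread reached _exit(0) before the alarm fired; -1 means the process was killed or could not be waited for. If this variant terminates under tcmalloc while the exit version hangs, that points at the destructor path rather than at the fork calls themselves.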
