
I have an exit handler thread that waits on a condition for the worker thread to finish its work. The signalling is done from the worker thread's destructor.

Below is the code of the exit handler thread.

void Class::TaskExitHandler::run() throw()
{
    while( ! isInterrupted() ) {

        _book->_eot_cond.wait(); // Waiting on this condition
        {
            // Hold the exit-list lock while joining and deleting exited tasks.
            CLASS_NAMESPACE::Guard<CLASS_NAMESPACE::FastLock> eguard(_book->_exitlist_lock);

            list<TaskGroupExecutor*>::const_iterator itr = _book->_exited_tasks.begin();

            for( ; itr != _book->_exited_tasks.end(); itr++ ) {
                (*itr)->join();
                TRACER(TRC_DEBUG)<< "Deleting exited task:" << (*itr)->getLoc() << ":"
                         << (*itr)->getTestID() << ":" << (*itr)->getReportName() << endl;
                delete (*itr);
            }
            _book->_exited_tasks.clear();
        }
        _book->executeAny();
    }
}

What I have observed is that when the worker thread catches any exception (raised from a lower layer), this thread continues and immediately dumps core with exit code 134, which is SIGABRT.

The stack trace is as follows:

#0  0x0000005555f49b4c in raise () from /lib64/libc.so.6
#1  0x0000005555f4b568 in abort () from /lib64/libc.so.6
#2  0x0000005555d848b4 in __gnu_cxx::__verbose_terminate_handler () from /usr/lib64/libstdc++.so.6
#3  0x0000005555d82210 in ?? () from /usr/lib64/libstdc++.so.6
#4  0x0000005555d82258 in std::terminate () from /usr/lib64/libstdc++.so.6
#5  0x0000005555d82278 in ?? () from /usr/lib64/libstdc++.so.6
#6  0x0000005555d81b18 in __cxa_call_unexpected () from /usr/lib64/libstdc++.so.6
#7  0x0000000120047898 in Class::TaskExitHandler::run ()
#8  0x000000012001cd38 in commutil::ThreadBase::thread_proxy ()
#9  0x0000005555c6e438 in start_thread () from /lib64/libpthread.so.0
#10 0x0000005555feed6c in __thread_start () from /lib64/libc.so.6
Backtrace stopped: frame did not save the PC

So it seems that this run() function, which declares with its "throw()" specification that it will not throw any exceptions, raises an exception (from frame 4). According to various references on __cxa_call_unexpected(), the stack trace shows the typical behaviour when an exception escapes a function with a "throw()" specification: the runtime aborts. Is my analysis of the problem right?

To test this, I added a try/catch in this method and printed the exception message. Now the process didn't dump core. The exception message was the same as the one caught by the worker thread. My question is: how does this thread get access to the exception caught by the other thread? Do they share some data structure related to exception handling?

Please shed some light on this. It is quite puzzling.

Note: As per the stack trace, call_unexpected is raised immediately after run() is called. That strengthens my suspicion that the exception stack or data is somehow shared, but I didn't find any references to this behaviour.


2 Answers


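I will answer my own question. What happened in this case is that a destructor was being called in the TaskExitHandler thread. This destructor was performing the same operation that caused the exception in the main thread.

Since the TaskExitHandler thread was designed not to throw (or rather, was expected not to), it had no try-catch block, so the process aborted when the exception was raised.

Since the destructor call was implicit, it never showed up in the stack trace, which made it very hard to find. I had to trace every object to find this exception leak.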
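The failure mode described above can be reproduced in a few lines. This is only a minimal sketch, not the original code: `ThrowingTask` and `destroy_guarded` are hypothetical names, and `noexcept` (C++11 and later) stands in for the old `throw()` specification. The destructor is marked `noexcept(false)` so that it is allowed to throw, as a pre-C++11 destructor could.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an object whose destructor repeats an
// operation that can fail in a lower layer.
struct ThrowingTask {
    bool fail;
    explicit ThrowingTask(bool f) : fail(f) {}
    // Destructors are noexcept by default since C++11; noexcept(false)
    // lets this one throw, as a C++03 destructor could.
    ~ThrowingTask() noexcept(false) {
        if (fail) throw std::runtime_error("lower layer failed");
    }
};

// Hypothetical non-throwing function, playing the role of run().
// The try/catch keeps the destructor's exception from escaping; without
// it, the exception would leave a noexcept function and the runtime
// would call std::terminate(), which aborts by default.
std::string destroy_guarded(bool fail) noexcept {
    try {
        ThrowingTask task(fail);
        // task's destructor runs implicitly at the closing brace below,
        // so no explicit call appears in the source.
    } catch (const std::exception& e) {
        return e.what();  // report the message instead of aborting
    }
    return "";
}
```

Calling `destroy_guarded(true)` returns the destructor's message, and `destroy_guarded(false)` returns an empty string. With the try/catch removed, the first call would abort the process in the same way as the stack trace in the question.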

Thanks to everyone for the active participation :) This is my first question to get an active response.

answered 2012-05-03T10:03:35.870

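I'll take a shot at this; hopefully it gives you enough to continue your research.

I suspect the thread running TaskExitHandler is the parent of all the worker threads. Otherwise, TEH would have a hard time joining its children.

The child/worker threads do not handle the exceptions thrown at them. However, the exception has to be handled somewhere, or the entire process will be shut down. The parent thread (aka TEH) is the last stop in the process stack/chain for handling exceptions. Your sample code shows that TEH's exception handling is simply to throw/not handle the exception. So it dumps core.

What is shared is not necessarily a data structure, but the process/thread ID and the memory space. Child threads do share global memory/heap space with the parent and with each other, hence the need for semaphores and/or mutexes for locking purposes.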
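Whether the exception-handling state itself is shared can be checked directly. The sketch below uses C++11 `std::thread` rather than the `commutil::ThreadBase` wrapper from the question, and `other_thread_sees_exception` is a hypothetical helper name. One thread handles an exception while a second thread asks `std::current_exception()` whether it has an exception of its own being handled.

```cpp
#include <exception>
#include <stdexcept>
#include <thread>

// Returns true if a second thread can observe the exception that the
// calling thread is currently handling. (Hypothetical helper.)
bool other_thread_sees_exception() {
    bool seen = false;
    try {
        throw std::runtime_error("worker failure");
    } catch (...) {
        // The calling thread is now inside a handler, so for this thread
        // std::current_exception() would be non-null.
        std::thread observer([&seen] {
            // Runs on a different thread, which has no active handler.
            seen = (std::current_exception() != nullptr);
        });
        observer.join();  // join() also makes the write to seen visible here
    }
    return seen;
}
```

On a conforming implementation the helper returns `false`: the memory is shared, but the record of which exception a thread is handling is kept per thread.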

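Good encapsulation dictates that the worker threads should be smart enough to handle any/all exceptions they may see. That way, an individual worker thread can be killed instead of taking down the parent thread and the rest of the process tree. Otherwise, you could continue to catch the exceptions in TEH, but it is unlikely that thread knows (or should know) what to do with the exception.

If the above is not clear, add a comment and I will be happy to explain further.

I did some research and confirmed that exceptions are generated against heap memory, not stack memory. All threads of your process share the same heap*, so it makes more sense (to me at least) why the parent thread would see the exception when the child thread does not catch it. *FWIW, if you fork your process instead of starting a new thread, you also get a new heap. However, fork is an expensive operation on memory, since you also copy all the heap contents into the new process.

This SO thread discusses setting up a thread to catch all exceptions, which may be of interest: Catching exceptions from another thread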
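One way to hand an exception from a worker to another thread on purpose is the C++11 `std::promise`/`std::future` pair. This is a sketch under assumptions, not the approach from the linked thread: `worker` and `handle_worker_failure` are hypothetical names, and the thrown error is made up for illustration.

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

// Hypothetical worker body: catches everything and stores the exception
// in the promise instead of letting it escape the thread function.
void worker(std::promise<int> result) {
    try {
        throw std::runtime_error("lower layer failed");
    } catch (...) {
        result.set_exception(std::current_exception());
    }
}

// Hypothetical handler side: the stored exception is rethrown by
// future::get() on whichever thread calls it.
std::string handle_worker_failure() {
    std::promise<int> result;
    std::future<int> outcome = result.get_future();
    std::thread t(worker, std::move(result));
    std::string message;
    try {
        outcome.get();  // blocks until the worker sets a value or exception
    } catch (const std::exception& e) {
        message = e.what();
    }
    t.join();
    return message;
}
```

Here the transfer is explicit: the worker captures its exception with `std::current_exception()`, and the handling thread only sees it because it calls `get()` on the matching future.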

answered 2012-05-01T18:06:26.450