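I dlopen and dlclose two different .so files several times, and after a few rounds the process blocks inside dlopen.

dlopen hangs without printing anything, CPU idle drops to 0%, and the process cannot be stopped with Ctrl+C.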

LOG_TRACE("attaching...");
handle = dlopen(plugin_path.c_str(), RTLD_LAZY);
LOG_DEBUG("dlopen called");     // after a few successful rounds, this line is never printed

Then I attached gdb to the process:

(gdb) bt
#0  0x0000002a960dbe60 in tcmalloc::ThreadCache::InitTSD () at src/thread_cache.cc:321
#1  0x0000002a960d51bf in TCMallocGuard (this=Variable "this" is not available.) at src/tcmalloc.cc:908
#2  0x0000002a960d5e00 in global constructors keyed to _ZN61FLAG__namespace_do_not_use_directly_use_DECLARE_int64_instead43FLAGS_tcmalloc_large_alloc_report_thresholdE () at src/tcmalloc.cc:935
#3  0x0000002a960fafc6 in __do_global_ctors_aux () at ./src/base/spinlock.h:54
#4  0x0000002a96010f13 in _init () from ../plugins/libmonitor.so
#5  0x0000002a00000000 in ?? ()
#6  0x000000302ad0acaf in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#7  0x000000302aff725c in dl_open_worker () from /lib64/tls/libc.so.6
#8  0x000000302ad0aa60 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#9  0x000000302aff79fa in _dl_open () from /lib64/tls/libc.so.6
#10 0x000000302b201054 in dlopen_doit () from /lib64/libdl.so.2
#11 0x000000302ad0aa60 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#12 0x000000302b201552 in _dlerror_run () from /lib64/libdl.so.2
#13 0x000000302b201092 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#14 0x000000000041b559 in uap::meta::MetaManageServiceHandler::plugin_action this=0xb26000, _return=@0x7fbffff500, plugin_name=@0x7fbffff4e0, plugin_path=@0x7fbffff570, t=Variable "t" is not available.)
at /usr/lib/gcc/x86_64-redhat-linux/3.4.5/../../../../include/c++/3.4.5/bits/basic_string.h:1456
#15 0x000000000041b0bc in uap::meta::MetaManageServiceHandler::plugin_action (this=0xb26000, _return=@0x7fbffff500, plugin_name=@0x7fbffff4e0, plugin_path=@0x7fbffff570, t=uap::meta::PluginActionType::RELOAD)
at server/service_handler.cpp:173
#16 0x0000000000417641 in uap::meta::test_Service_Handler_suite_test_case_manage_service_plugin_action_Test::TestBody (this=0xb16080) at test_load.cpp:73
#17 0x00000000004446c6 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0xb16080, method={__pfn = 0x21, __delta = 0}, location=0x537f30 "the test body")
at ../../../../com/btest/gtest/src/gtest.cc:2744
#18 0x000000000042dd1c in testing::Test::Run (this=0xb16080) at ../../../../com/btest/gtest/src/gtest.cc:2766
#19 0x000000000042e8b4 in testing::TestInfo::Run (this=0xb17160) at ../../../../com/btest/gtest/src/gtest.cc:2958
#20 0x000000000042f415 in testing::TestCase::Run (this=0xb23000, runtype=0) at ../../../../com/btest/gtest/src/gtest.cc:3160
#21 0x0000000000436352 in testing::internal::UnitTestImpl::RunAllTests (this=0xb22000) at ../../../../com/btest/gtest/src/gtest.cc:5938
#22 0x0000000000434299 in testing::UnitTest::Run (this=0x6f4220, run_type=0) at ../../../../com/btest/gtest/src/gtest.cc:5449
#23 0x0000000000434268 in testing::UnitTest::Run (this=0x6f4220) at ../../../../com/btest/gtest/src/gtest.cc:5387
#24 0x0000000000455404 in main (argc=1, argv=0x7fbffff8c8) at ../../../../com/btest/gtest/src/gtest_main.cc:38

In fact, I have already redefined these four functions:

void __attribute__((constructor)) dlinit()
{
}

void __attribute__((destructor)) dldeinit()
{
}

void _init()
{
}

void _fini()
{
}

2 Answers

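I think I found the root cause. According to the gdb backtrace, the hang comes from tcmalloc. I read the relevant tcmalloc code and found several locks there, so I compiled and linked the .so without tcmalloc, and the hang no longer occurs. This looks like a bug in how tcmalloc works when it is loaded through a shared object.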

answered 2013-07-05T07:36:16.120

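You should compile both your application and your plugin with gcc -Wall -g and use the gdb debugger (don't forget to compile the plugin's source code with -fPIC and to link its object files with -shared).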

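As you probably know, dlopen-ing a shared object runs the functions that have the constructor function attribute (and also, as dlopen(3) says, the obsolete _init function). Constructors of C++ static data are run the same way, as if they had that constructor attribute.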
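
To make the load-time behaviour concrete, here is a minimal sketch. The names on_load, LoadGuard and the two counters are hypothetical and exist only for illustration; LoadGuard merely stands in for a library global such as the TCMallocGuard visible in the backtrace. Compiled into a shared object, both pieces of code would run inside dlopen; compiled into an executable, they run before main.

```cpp
#include <cstdio>

// Plain ints are zero-initialized before any load-time code runs,
// so they are safe to modify from constructors.
static int g_attr_ctor_runs = 0;
static int g_static_obj_ctor_runs = 0;

// Runs at load time: before main() in an executable,
// or inside dlopen() when built into a shared object.
__attribute__((constructor)) static void on_load() {
    ++g_attr_ctor_runs;
}

// Hypothetical stand-in for a library's global guard object.
struct LoadGuard {
    LoadGuard() { ++g_static_obj_ctor_runs; }
};
static LoadGuard g_guard;  // its constructor also runs at load time

// Accessors so callers can check that both hooks ran.
int attr_ctor_runs() { return g_attr_ctor_runs; }
int static_obj_ctor_runs() { return g_static_obj_ctor_runs; }
```

If any such constructor blocks or spins, for example on a lock, dlopen does not return, which would be consistent with the backtrace in the question.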

My guess is that some of these constructors are blocked somehow (perhaps waiting on input). You could also strace your program.

The blocking might have other causes as well, e.g. dlopen-ing a file that lives on an NFS mount whose NFS server is unresponsive.

See also rtld-audit(7), ld.so(8), and the LD_DEBUG environment variable (try setting it to all). Also, run ldd on both the plugin and the program.

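By the way, in your code, the missing terminating newline \n in the printf format strings is suspicious (and in bad taste), and you should print dlerror() when dlopen fails. At the very least, add a call to fflush(NULL); after your code. Try coding: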

handle = dlopen(plugin_path.c_str(), RTLD_LAZY);
if(!handle) { 
    printf("dlopening %s failed %s\n", plugin_path.c_str(), dlerror());
} else { 
    printf("dlopen %s success\n", plugin_path.c_str());
}
fflush(NULL);
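
Since the hang only appears after several load and unload rounds, a small driver loop may help pin down the round on which it happens. The following is only a sketch under assumptions: reload_repeatedly is a hypothetical helper, not part of the original program, and it logs to stderr, which is unbuffered by default, so the last line printed shows the step that did not return.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: load and unload `path` up to `rounds` times.
// Returns the number of successful dlopen calls.
int reload_repeatedly(const char* path, int rounds) {
    int loaded = 0;
    for (int i = 0; i < rounds; ++i) {
        std::fprintf(stderr, "round %d: calling dlopen(%s)\n", i, path);
        void* handle = dlopen(path, RTLD_LAZY);
        if (!handle) {
            // Read dlerror() right away; a later dl* call may overwrite it.
            const char* err = dlerror();
            std::fprintf(stderr, "round %d: dlopen failed: %s\n",
                         i, err ? err : "(no message)");
            break;
        }
        ++loaded;
        std::fprintf(stderr, "round %d: dlopen ok, calling dlclose\n", i);
        if (dlclose(handle) != 0) {
            const char* err = dlerror();
            std::fprintf(stderr, "round %d: dlclose failed: %s\n",
                         i, err ? err : "(no message)");
            break;
        }
    }
    return loaded;
}
```

Calling it with the plugin path from the backtrace, for example reload_repeatedly("../plugins/libmonitor.so", 10), would show whether the hang always occurs on the same round. On older glibc versions the program has to be linked with -ldl.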

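You may also have corrupted the heap (elsewhere in your program) to the point that dlopen (or your plugin) can no longer work. Use valgrind to hunt for memory corruption bugs!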

answered 2013-07-04T04:40:11.217