c++ - What does (.eh) mean in nm output?

Question

When I look at the symbols in my library, nm mylib.a, I see some duplicate entries that look like this:

000000000002d130 S __ZN7quadmat11SpAddLeavesC1EPNS_14BlockContainerEPy
00000000000628a8 S __ZN7quadmat11SpAddLeavesC1EPNS_14BlockContainerEPy.eh

When piped through c++filt:

000000000002d130 S quadmat::SpAddLeaves::SpAddLeaves(quadmat::BlockContainer*, unsigned long long*)
00000000000628a8 S quadmat::SpAddLeaves::SpAddLeaves(quadmat::BlockContainer*, unsigned long long*) (.eh)

What does that .eh mean, and what is this extra symbol used for?

I see it has something to do with exception handling. But why does that use an extra symbol?

(I'm noticing this with clang)

score 4 · Accepted Answer

Here is some simple code:

bool extenrnal_variable;

int f(...)
{
    if (extenrnal_variable)
        throw 0;

    return 42;
}

int g()
{
    return f(1, 2, 3);
}

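I added extenrnal_variable to keep the compiler from optimizing away the branches. f takes a variadic parameter list (...) to prevent it from being inlined.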

When compiled with:

$ clang++ -S -O3 -m32 -o - eh.cpp | c++filt

it emits the following code for g() (the rest is omitted):

g():                                 ## @_Z1gv
    .cfi_startproc
## BB#0:
    pushl   %ebp
Ltmp9:
    .cfi_def_cfa_offset 8
Ltmp10:
    .cfi_offset %ebp, -8
    movl    %esp, %ebp
Ltmp11:
    .cfi_def_cfa_register %ebp
    subl    $24, %esp
    movl    $3, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1, (%esp)
    calll   f(...)
    movl    $42, %eax
    addl    $24, %esp
    popl    %ebp
    ret
    .cfi_endproc

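All these .cfi_* directives are used to unwind the stack if an exception is thrown. They are all compiled into an FDE (Frame Description Entry) block, which is stored under the name g().eh (__Z1gv.eh when mangled). The directives specify where the CPU registers are saved on the stack. When an exception is thrown and the stack is being unwound, the code in the function must not be executed (except for the destructors of local variables), but the previously saved registers must be restored. These tables store exactly that information.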

These tables can be dumped with the dwarfdump tool:

$ dwarfdump --eh-frame --english eh.o | c++filt

Output:

0x00000018: FDE
        length: 0x00000018
   CIE_pointer: 0x00000000
    start_addr: 0x00000000 f(...)
    range_size: 0x0000004d (end_addr = 0x0000004d)
  Instructions: 0x00000000: CFA=esp+4     eip=[esp]
                0x00000001: CFA=esp+8     ebp=[esp]  eip=[esp+4]
                0x00000003: CFA=ebp+8     ebp=[ebp]  eip=[ebp+4]
                0x00000007: CFA=ebp+8     ebp=[ebp]  esi=[ebp-4]  eip=[ebp+4]

0x00000034: FDE
        length: 0x00000018
   CIE_pointer: 0x00000000
    start_addr: 0x00000050 g()
    range_size: 0x0000002c (end_addr = 0x0000007c)
  Instructions: 0x00000050: CFA=esp+4     eip=[esp]
                0x00000051: CFA=esp+8     ebp=[esp]  eip=[esp+4]
                0x00000053: CFA=ebp+8     ebp=[ebp]  eip=[ebp+4]

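The format of this block is documented separately, as are some alternative, more compact ways of representing the same information. Basically, this block describes which registers are popped during stack unwinding, and from where on the stack.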
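To make the table rows more concrete, here is a minimal sketch of how one row such as "CFA=ebp+8  ebp=[ebp]  eip=[ebp+4]" could be applied to recover the caller's registers. The names Row, Regs, Memory and unwind_one_frame are hypothetical, the stack is a fake map rather than real memory, and only the three registers that appear in the dump are modelled. A real unwinder does considerably more.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical model of one unwind-table row. Slots are expressed relative
// to the CFA, so "CFA=ebp+8  ebp=[ebp]  eip=[ebp+4]" becomes
// ebp_slot = -8 and eip_slot = -4.
struct Row {
    enum Reg { ESP, EBP } cfa_base;  // register the CFA is computed from
    int  cfa_offset;                 // CFA = cfa_base + cfa_offset
    bool ebp_saved;                  // does the row say where ebp was saved?
    int  ebp_slot;                   // CFA-relative slot of the saved ebp
    int  eip_slot;                   // CFA-relative slot of the return address
};

struct Regs { std::uint32_t esp, ebp, eip; };

// Fake stack: address -> 32-bit value (an assumption made for the sketch).
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Recover the caller's registers without running any code of the callee.
Regs unwind_one_frame(const Regs& cur, const Row& row, const Memory& stack)
{
    const std::uint32_t base = (row.cfa_base == Row::EBP) ? cur.ebp : cur.esp;
    const std::uint32_t cfa  = base + row.cfa_offset;

    Regs caller = cur;
    caller.eip = stack.at(cfa + row.eip_slot);      // return address
    if (row.ebp_saved)
        caller.ebp = stack.at(cfa + row.ebp_slot);  // restore saved ebp
    caller.esp = cfa;  // the CFA is the caller's esp just before the call
    return caller;
}

// Applies the last row listed for g() above to a made-up stack.
Regs demo_unwind_g()
{
    const Row row{Row::EBP, 8, true, -8, -4};
    Memory stack;
    stack[0x1000] = 0x2000;  // saved ebp of the caller (made-up value)
    stack[0x1004] = 0x1234;  // return address (made-up value)
    const Regs in_g{0x0fe8, 0x1000, 0x60};
    return unwind_one_frame(in_g, row, stack);
}
```

With the made-up values above, demo_unwind_g() yields eip 0x1234, ebp 0x2000 and esp 0x1008, that is, the state of the caller right after the call returns.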

To see the raw contents of these symbols, you can list all the symbols together with their offsets:

$ nm -n eh.o

00000000 T __Z1fz
         U __ZTIi
         U ___cxa_allocate_exception
         U ___cxa_throw
00000050 T __Z1gv
000000a8 s EH_frame0
000000c0 S __Z1fz.eh
000000dc S __Z1gv.eh
000000f8 S _extenrnal_variable

And then dump the (__TEXT,__eh_frame) section:

$ otool -s __TEXT __eh_frame eh.o

eh.o:
Contents of (__TEXT,__eh_frame) section
000000a8    14 00 00 00 00 00 00 00 01 7a 52 00 01 7c 08 01
000000b8    10 0c 05 04 88 01 00 00 18 00 00 00 1c 00 00 00
000000c8    38 ff ff ff 4d 00 00 00 00 41 0e 08 84 02 42 0d
000000d8    04 44 86 03 18 00 00 00 38 00 00 00 6c ff ff ff
000000e8    2c 00 00 00 00 41 0e 08 84 02 42 0d 04 00 00 00

By matching the offsets, you can see how each symbol is encoded.
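The matching can also be done mechanically. The following sketch walks the bytes of the dump above and splits them into entries. It assumes a 32-bit little-endian target and the PC-relative 4-byte pointer encoding (0x10) that appears in the CIE's augmentation data. The names Entry, read_u32 and parse_eh_frame are hypothetical, and the CFA instructions inside each entry are not decoded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes copied from the otool dump; the section starts at offset 0xa8.
static const std::uint32_t kSectionBase = 0xa8;
static const std::uint8_t kEhFrame[] = {
    0x14,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x7a,0x52,0x00,0x01,0x7c,0x08,0x01,
    0x10,0x0c,0x05,0x04,0x88,0x01,0x00,0x00,0x18,0x00,0x00,0x00,0x1c,0x00,0x00,0x00,
    0x38,0xff,0xff,0xff,0x4d,0x00,0x00,0x00,0x00,0x41,0x0e,0x08,0x84,0x02,0x42,0x0d,
    0x04,0x44,0x86,0x03,0x18,0x00,0x00,0x00,0x38,0x00,0x00,0x00,0x6c,0xff,0xff,0xff,
    0x2c,0x00,0x00,0x00,0x00,0x41,0x0e,0x08,0x84,0x02,0x42,0x0d,0x04,0x00,0x00,0x00,
};

struct Entry {
    std::uint32_t offset;      // offset of the entry, as shown by nm
    std::uint32_t length;      // length field (excludes its own 4 bytes)
    bool          is_cie;      // CIE if the second word is 0, else FDE
    std::uint32_t cie_offset;  // FDE only: offset of the CIE it refers to
    std::uint32_t pc_begin;    // FDE only: start address of the function
    std::uint32_t pc_range;    // FDE only: size of the function in bytes
};

// Reads a little-endian 32-bit word at byte position pos.
static std::uint32_t read_u32(std::size_t pos)
{
    assert(pos + 4 <= sizeof kEhFrame);
    return  std::uint32_t(kEhFrame[pos])
         | (std::uint32_t(kEhFrame[pos + 1]) << 8)
         | (std::uint32_t(kEhFrame[pos + 2]) << 16)
         | (std::uint32_t(kEhFrame[pos + 3]) << 24);
}

std::vector<Entry> parse_eh_frame()
{
    std::vector<Entry> out;
    std::size_t pos = 0;
    while (pos + 8 <= sizeof kEhFrame) {
        const std::uint32_t length = read_u32(pos);
        if (length == 0) break;  // a zero length terminates the section
        const std::uint32_t id = read_u32(pos + 4);

        Entry e{};
        e.offset = kSectionBase + std::uint32_t(pos);
        e.length = length;
        e.is_cie = (id == 0);
        if (!e.is_cie) {
            // The CIE pointer counts backwards from its own position.
            e.cie_offset = kSectionBase + std::uint32_t(pos + 4) - id;
            // pc_begin is relative to the address of the field itself;
            // unsigned wrap-around handles the negative displacement.
            e.pc_begin = kSectionBase + std::uint32_t(pos + 8) + read_u32(pos + 8);
            e.pc_range = read_u32(pos + 12);
        }
        out.push_back(e);
        pos += 4 + length;
    }
    return out;
}
```

For the bytes above, parse_eh_frame() returns three entries: a CIE at 0xa8 (EH_frame0), an FDE at 0xc0 covering 0x4d bytes from address 0x00 (__Z1fz.eh), and an FDE at 0xdc covering 0x2c bytes from address 0x50 (__Z1gv.eh). Both FDEs point back to the CIE at 0xa8, which agrees with the nm and dwarfdump listings.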

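When there are local variables, they have to be destroyed during stack unwinding. To do that, more code is usually embedded in the function itself, and some additional, larger tables are created. You can explore this yourself by adding a local variable with a non-trivial destructor to g, compiling, and looking at the assembly output.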
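A minimal variant of the example for that experiment might look like the following. Noisy, destructor_calls and run_demo are hypothetical names added for illustration, and extenrnal_variable is initialized to true here (an assumption) so that f actually throws.

```cpp
#include <cassert>
#include <cstdio>

bool extenrnal_variable = true;  // same name as above; true so that f throws

int destructor_calls = 0;        // hypothetical counter, for illustration only

// Hypothetical type with a non-trivial destructor.
struct Noisy {
    ~Noisy() { ++destructor_calls; std::puts("~Noisy"); }
};

int f(...)
{
    if (extenrnal_variable)
        throw 0;

    return 42;
}

int g()
{
    Noisy local;          // must be destroyed while the stack is unwound
    return f(1, 2, 3);
}

// Calls g, catches the exception thrown by f, and reports how many
// destructors ran during unwinding.
int run_demo()
{
    destructor_calls = 0;
    bool caught = false;
    try {
        g();
    } catch (int) {
        caught = true;
    }
    assert(caught);
    return destructor_calls;
}
```

run_demo() returns 1: the destructor of local runs exactly once while the exception passes through g. Compiling this file with the same clang++ -S command as above and comparing the code emitted for g() with the earlier listing shows the extra cleanup code and tables.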


score 2 · Accepted Answer

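It stands for exception handler, and is usually associated with the following information:

If you are using an export list and building either a shared library or an executable that will be used with ld's -bundle_loader flag, you need to include the symbols for exception frame information in the export list for your exported C++ symbols. Otherwise, they may be stripped. These symbols end in .eh; you can view them using the nm tool.

From XcodeUserGuide20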
