linux - What is the meaning of question marks '?' in Linux kernel panic call traces?

Question

The Call Trace contains entries like this:

 [<deadbeef>] FunctionName+0xAB/0xCD [module_name]
 [<f00fface>] ? AnotherFunctionName+0x12/0x40 [module_name]
 [<deaffeed>] ClearFunctionName+0x88/0x88 [module_name]

What is the meaning of the '?' mark before AnotherFunctionName?

score 37 · Accepted Answer

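'?' means that the information about this stack entry is probably not reliable.

The stack output mechanism (see the implementation of the dump_trace() function) was unable to prove that the address it found is a valid return address in the call stack.

The '?' itself is output by printk_stack_address().

The stack entry may or may not be valid. Sometimes one can simply skip it. It may help to examine the disassembly of the module involved to see which function is called at ClearFunctionName+0x88 (or, on x86, immediately before that position).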

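Concerning reliability

On x86, when dump_stack() is called, the function that actually examines the stack is print_context_stack(), defined in arch/x86/kernel/dumpstack.c. Take a look at its code; I will try to explain it below.

I assume that DWARF2 stack unwinding facilities are not available on your Linux system (most likely they are not, unless it is OpenSUSE or SLES). In that case, print_context_stack() seems to do the following.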

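It starts from an address (the 'stack' variable in the code) that is guaranteed to be the address of a stack location. It is actually the address of a local variable in dump_stack().

The function repeatedly increments that address ( while (valid_stack_ptr ...) { ... stack++} ) and checks whether the value it points to could also be an address in the kernel code ( if (__kernel_text_address(addr)) ... ). In this way it tries to find the return addresses that were pushed onto the stack when the functions were called.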
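
The scanning step can be illustrated with a small user-space sketch. This is not the kernel's code: the text-section bounds FAKE_TEXT_START and FAKE_TEXT_END, the helper fake_kernel_text_address() and the function scan_for_candidates() are all hypothetical stand-ins, and the "stack" is just an array of words.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds standing in for the kernel text section. */
#define FAKE_TEXT_START 0xc1000000UL
#define FAKE_TEXT_END   0xc1800000UL

/* Stand-in for __kernel_text_address(): here just a range check. */
static int fake_kernel_text_address(unsigned long addr)
{
    return addr >= FAKE_TEXT_START && addr < FAKE_TEXT_END;
}

/*
 * Walk `nwords` stack words in order and copy every value that looks
 * like a code address into `out` (at most `max_out` entries).
 * Returns the number of candidates found. Nothing here proves that a
 * candidate really is a return address.
 */
static size_t scan_for_candidates(const unsigned long *stack, size_t nwords,
                                  unsigned long *out, size_t max_out)
{
    size_t found = 0;

    assert(stack != NULL && out != NULL);
    for (size_t i = 0; i < nwords && found < max_out; i++) {
        if (fake_kernel_text_address(stack[i]))
            out[found++] = stack[i];
    }
    return found;
}
```

Every value in the fake text range is reported, including stale leftovers from earlier calls, which is why a second check is needed before an entry can be trusted.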

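Of course, not every unsigned long value that looks like a return address actually is one, so the function tries to verify it. If frame pointers are used in the kernel code (the %ebp/%rbp registers serve this purpose when CONFIG_FRAME_POINTER is set), they can be used to traverse the stack frames of the functions. The return address of a function lies just above the frame pointer (i.e. at %ebp/%rbp + sizeof(unsigned long)). print_context_stack() checks exactly that.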

If there is a stack frame for which the location 'stack' points to holds the return address, the value is considered a reliable stack entry. ops->address will be called for it with reliable == 1; this eventually calls printk_stack_address(), and the value is output as a reliable call stack entry. Otherwise the address is considered unreliable. It is output anyway, but with '?' prepended.
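
The reliable/unreliable decision can be sketched in user space as well. Again, this is an illustration under assumptions rather than the kernel's implementation: the stack is an array of words, a saved frame pointer is stored as the array index of the caller's frame, and FAKE_TEXT_START, FAKE_TEXT_END, NO_FRAME, struct entry, fake_kernel_text_address() and classify_stack() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds standing in for the kernel text section. */
#define FAKE_TEXT_START 0xc1000000UL
#define FAKE_TEXT_END   0xc1800000UL
#define NO_FRAME        ((size_t)-1)  /* no (more) frame pointer information */

struct entry {
    unsigned long addr;
    int reliable;  /* 0 means the printer would prepend '?' */
};

/* Stand-in for __kernel_text_address(): here just a range check. */
static int fake_kernel_text_address(unsigned long addr)
{
    return addr >= FAKE_TEXT_START && addr < FAKE_TEXT_END;
}

/*
 * Scan the fake stack. `frame` is the index of the word holding the
 * current saved frame pointer; the return address is expected in the
 * very next word. A candidate found exactly there is reliable, and the
 * walk then follows the saved frame pointer to the caller's frame.
 * Any other candidate is recorded as unreliable.
 */
static size_t classify_stack(const unsigned long *stack, size_t nwords,
                             size_t frame, struct entry *out, size_t max_out)
{
    size_t found = 0;

    assert(stack != NULL && out != NULL);
    for (size_t i = 0; i < nwords && found < max_out; i++) {
        unsigned long addr = stack[i];

        if (!fake_kernel_text_address(addr))
            continue;

        out[found].addr = addr;
        if (frame != NO_FRAME && i == frame + 1) {
            out[found].reliable = 1;
            /* Move to the caller's frame; stop on an implausible link. */
            size_t next = (size_t)stack[frame];
            frame = (next > frame && next + 1 < nwords) ? next : NO_FRAME;
        } else {
            out[found].reliable = 0;
        }
        found++;
    }
    return found;
}
```

Calling classify_stack() with frame set to NO_FRAME marks every candidate as unreliable, which mirrors what happens when no frame pointer information is available.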

[NB] If frame pointer information is not available (as was the case by default in Debian 6, for example), all call stack entries will be marked as unreliable for this reason.

Systems with DWARF2 unwinding support (and with CONFIG_STACK_UNWIND set) are another story entirely.
