c - Examining machine instructions at run time

Question

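I am trying to print a function's opcodes at run time. To do this, I wrote a C program that should print each address along with the hex data stored at that address. Here it tries to print the contents of the mul function.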

#include <stdio.h>

int add(int a, int b)
{
    printf("Adding..\n");
    return a+b;
}

int sub(int a, int b)
{
    printf("Subtracting...\n");
    return a-b;
}

int mul(int a, int b)
{
    printf("Multiplying...\n");
    return add(a,b) * sub(a,b);
}

int main()
{
    char *ptr;
    int i;
    char a;

    int (*func)(int,int);

    mul(4,3);
    func = &mul;
    ptr = (char *)func;

    do
    {
        a = *ptr;
        printf("%p %x\n",ptr,a);
        ptr++;
    }while (a != 0xffffffc3); 
    //op code for ret is c3, which specifies end of function
    //however, i am not certain why it opcode is being padded by 0xffffff
}

The output it gives is:

Multiplying...
Adding..
Subtracting...
0x4005a4 55
0x4005a5 48
0x4005a6 ffffff89
0x4005a7 ffffffe5
0x4005a8 53
0x4005a9 48
0x4005aa ffffff83
0x4005ab ffffffec
0x4005ac 18
0x4005ad ffffff89
0x4005ae 7d
0x4005af ffffffec
0x4005b0 ffffff89
0x4005b1 75
0x4005b2 ffffffe8
0x4005b3 ffffffbf
0x4005b4 c
0x4005b5 7
0x4005b6 40
0x4005b7 0
0x4005b8 ffffffe8
0x4005b9 63
0x4005ba fffffffe
0x4005bb ffffffff
0x4005bc ffffffff
0x4005bd ffffff8b
0x4005be 55
0x4005bf ffffffe8
0x4005c0 ffffff8b
0x4005c1 45
0x4005c2 ffffffec
0x4005c3 ffffff89
0x4005c4 ffffffd6
0x4005c5 ffffff89
0x4005c6 ffffffc7
0x4005c7 ffffffe8
0x4005c8 ffffff90
0x4005c9 ffffffff
0x4005ca ffffffff
0x4005cb ffffffff
0x4005cc ffffff89
0x4005cd ffffffc3

The output is almost what I want, but some opcodes are padded on the left with 0xffffff and are read as negative values. Why does this happen?

The objdump of the ELF file is as follows:

 00000000004005a4 <mul>:
  4005a4:   55                      push   %rbp
  4005a5:   48 89 e5                mov    %rsp,%rbp
  4005a8:   53                      push   %rbx
  4005a9:   48 83 ec 18             sub    $0x18,%rsp
  4005ad:   89 7d ec                mov    %edi,-0x14(%rbp)
  4005b0:   89 75 e8                mov    %esi,-0x18(%rbp)
  4005b3:   bf 0c 07 40 00          mov    $0x40070c,%edi
  4005b8:   e8 63 fe ff ff          callq  400420 <puts@plt>
  4005bd:   8b 55 e8                mov    -0x18(%rbp),%edx
  4005c0:   8b 45 ec                mov    -0x14(%rbp),%eax
  4005c3:   89 d6                   mov    %edx,%esi
  4005c5:   89 c7                   mov    %eax,%edi
  4005c7:   e8 90 ff ff ff          callq  40055c <add>
  4005cc:   89 c3                   mov    %eax,%ebx
  4005ce:   8b 55 e8                mov    -0x18(%rbp),%edx
  4005d1:   8b 45 ec                mov    -0x14(%rbp),%eax
  4005d4:   89 d6                   mov    %edx,%esi
  4005d6:   89 c7                   mov    %eax,%edi
  4005d8:   e8 a1 ff ff ff          callq  40057e <sub>
  4005dd:   0f af c3                imul   %ebx,%eax
  4005e0:   48 83 c4 18             add    $0x18,%rsp
  4005e4:   5b                      pop    %rbx
  4005e5:   5d                      pop    %rbp
  4005e6:   c3                      retq

The hex bytes are almost identical, apart from the 0xffffff padding. I cannot figure out why.

score 4 · Accepted Answer

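This is because char is signed on your system. When a char is passed to printf it is promoted to int, so any byte of 0x80 or above becomes a negative value and is sign-extended, which %x then prints as ffffffXX. Use unsigned char instead, or (as suggested in the comments) uint8_t if you are using a C implementation that provides it.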
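A minimal sketch of the difference, assuming a two's-complement target with 32-bit int (as on the x86-64 system in the question). The helper names widen_as_signed_char, widen_as_unsigned_char, and dump_bytes are hypothetical and exist only for illustration; they are not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Mimics what the question's loop does: the byte is read through a signed
   char, then widened before printing. Assumes two's complement and 32-bit
   int; converting 0xc3 to signed char is implementation-defined before C23. */
static unsigned int widen_as_signed_char(unsigned char byte)
{
    signed char c = (signed char)byte;  /* 0xc3 becomes -61 on typical targets */
    return (unsigned int)c;             /* -61 converts to 0xffffffc3 */
}

/* The suggested fix: read through unsigned char, so widening zero-extends. */
static unsigned int widen_as_unsigned_char(unsigned char byte)
{
    return (unsigned int)byte;          /* 0xc3 stays 0xc3 */
}

/* Hypothetical helper: print n bytes, one per line, as address and
   two-digit hex. Works on any readable object, not only functions. */
static void dump_bytes(const unsigned char *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; i++)
        printf("%p %02x\n", (const void *)(p + i), (unsigned int)p[i]);
}
```

In the original program, the equivalent change would be to declare ptr and a as unsigned char and compare against 0xc3 rather than 0xffffffc3.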

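Also, since you cannot portably convert a function pointer to void *, I don't think you can portably assume that a function pointer points to readable memory holding the function's machine-code representation.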

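I realize this is typical and somewhat logical, but I don't think C guarantees it. Where that assumption does not hold, the program triggers undefined behavior. Hopefully it won't do anything harmful, and will still be instructive (pun intended).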
