20

OS: GNU/Linux
Distro: OpenSuSe 13.1
Arch: x86-64
GDB version: 7.6.50.20130731-cvs
Program language: mostly C with minor bits of assembly

Imagine that I've got rather big program that sometimes fails to open a file. Is it possible to set breakpoint in GDB in such way that it stops after open(2) syscall returns -1?

Of course, I can grep through the source code and find all open(2) invocations and narrow down the faulting open() call but maybe there's a better way.

I tried to use "catch syscall open" then "condition N if $rax==-1" but obviously it didn't get hit.
BTW, Is it possible to distinct between a call to syscall (e.g. open(2)) and return from syscall (e.g. open(2)) in GDB?

As a current workaround I do the following:

  1. Run the program in question under the GDB
  2. From another terminal launch systemtap script:

    stap -g -v -e 'probe process("PATH to the program run under GDB").syscall.return { if( $syscall == 2 && $return <0) raise(%{ SIGSTOP %}) }'
    
  3. After open(2) returns -1 I receive SIGSTOP in GDB session and I can debug the issue.

TIA.

Best regards,
alexz.

UPD: Even though I tried the approach suggested by n.m before and wasn't able to make it work I decided to give it another try. After 2 hours it now works as intended. But with some weird workaround:

  1. I still can't distinct between call and return from syscall
  2. If I use finish in comm I can't use continue, which is OK according to GDB docs
    i.e. the following does drop to gdb prompt on each break:

    gdb> comm
    gdb> finish
    gdb> printf "rax is %d\n",$rax
    gdb> cont
    gdb> end
    
  3. Actually I can avoid using finish and check %rax in commands but in this case I have to check for -errno rather than -1 e.g. if it's "Permission denied" then I have to check for "-13" and if it's "No such file or direcory" - then for -2. It's just simply not right

  4. So the only way to make it work for me was to define custom function and use it in the following way:

    (gdb) catch syscall open
    Catchpoint 1 (syscall 'open' [2]
    (gdb) define mycheck
    Type commands for definition of "mycheck".
    End with a line saying just "end".
    >finish
    >finish
    >if ($rax != -1)
     >cont
     >end
    >printf "rax is %d\n",$rax
    >end
    (gdb) comm
    Type commands for breakpoint(s) 1, one per line.
    End with a line saying just "end".
    >mycheck
    >end
    (gdb) r
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /home/alexz/gdb_syscall_test/main
    .....
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    24                      fd = open(filenames[i], O_RDONLY);
    Opening test1
    fd = 3 (0x3)
    Successfully opened test1
    
    Catchpoint 1 (call to syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    rax is -38
    
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    ---Type <return> to continue, or q <return> to quit---
    24                      fd = open(filenames[i], O_RDONLY);
    rax is -1
    (gdb) bt
    #0  0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    (gdb) step
    26                      printf("Opening %s\n", filenames[i]);
    (gdb) info locals
    i = 1
    fd = -1
    
4

2 回答 2

10

此 gdb 脚本执行所要求的操作:

set $outside = 1
catch syscall open
commands
  silent
  set $outside = ! $outside
  if ( $outside && $rax >= 0)
    continue
  end
  if ( !$outside )
    continue
  end
  echo `open' returned a negative value\n
end

$outside变量是必需的,因为gdb在系统调用进入和系统调用退出时都停止。我们需要忽略进入事件并$rax仅在退出时检查。

于 2014-09-22T17:06:14.873 回答
6

是否可以在 GDB 中设置断点,使其在 open(2) 系统调用返回 -1 后停止?

对于这个狭窄n.m.的问题,很难比s 回答做得更好,但我认为这个问题提出的不正确。

当然,我可以通过源代码 grep 并找到所有 open(2) 调用

这是您困惑的一部分:当您调用openC 程序时,您实际上并没有执行open(2)系统调用。相反,您正在从您的 libc 调用open(3)“存根”,该存根将为open(2)您执行系统调用。

如果你想在存根即将返回时设置断点-1,那也很容易。

例子:

/* t.c */
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
  int fd = open("/no/such/file", O_RDONLY);
  return fd == -1 ? 0 : 1;
}

$ gcc -g t.c; gdb -q ./a.out
(gdb) start
Temporary breakpoint 1 at 0x4004fc: file t.c, line 6.
Starting program: /tmp/a.out

Temporary breakpoint 1, main () at t.c:6
6     int fd = open("/no/such/file", O_RDONLY);
(gdb) s
open64 () at ../sysdeps/unix/syscall-template.S:82
82  ../sysdeps/unix/syscall-template.S: No such file or directory.

在这里,我们已经到达了 glibc 系统调用存根。让我们拆解它:

(gdb) disas
Dump of assembler code for function open64:
=> 0x00007ffff7b01d00 <+0>: cmpl   $0x0,0x2d74ad(%rip)        # 0x7ffff7dd91b4 <__libc_multiple_threads>
   0x00007ffff7b01d07 <+7>: jne    0x7ffff7b01d19 <open64+25>
   0x00007ffff7b01d09 <+0>: mov    $0x2,%eax
   0x00007ffff7b01d0e <+5>: syscall
   0x00007ffff7b01d10 <+7>: cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d16 <+13>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d18 <+15>:    retq
   0x00007ffff7b01d19 <+25>:    sub    $0x8,%rsp
   0x00007ffff7b01d1d <+29>:    callq  0x7ffff7b1d050 <__libc_enable_asynccancel>
   0x00007ffff7b01d22 <+34>:    mov    %rax,(%rsp)
   0x00007ffff7b01d26 <+38>:    mov    $0x2,%eax
   0x00007ffff7b01d2b <+43>:    syscall
   0x00007ffff7b01d2d <+45>:    mov    (%rsp),%rdi
   0x00007ffff7b01d31 <+49>:    mov    %rax,%rdx
   0x00007ffff7b01d34 <+52>:    callq  0x7ffff7b1d0b0 <__libc_disable_asynccancel>
   0x00007ffff7b01d39 <+57>:    mov    %rdx,%rax
   0x00007ffff7b01d3c <+60>:    add    $0x8,%rsp
   0x00007ffff7b01d40 <+64>:    cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d46 <+70>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d48 <+72>:    retq
   0x00007ffff7b01d49 <+73>:    mov    0x2d10d0(%rip),%rcx        # 0x7ffff7dd2e20
   0x00007ffff7b01d50 <+80>:    xor    %edx,%edx
   0x00007ffff7b01d52 <+82>:    sub    %rax,%rdx
   0x00007ffff7b01d55 <+85>:    mov    %edx,%fs:(%rcx)
   0x00007ffff7b01d58 <+88>:    or     $0xffffffffffffffff,%rax
   0x00007ffff7b01d5c <+92>:    jmp    0x7ffff7b01d48 <open64+72>
End of assembler dump.

在这里,您可以看到存根的行为不同,具体取决于程序是否具有多个线程。这与异步取消有关。

有两个系统调用指令,在一般情况下,我们需要在每个指令之后设置一个断点(但见下文)。

但是这个例子是单线程的,所以我可以设置一个条件断点:

(gdb) b *0x00007ffff7b01d10 if $rax < 0
Breakpoint 2 at 0x7ffff7b01d10: file ../sysdeps/unix/syscall-template.S, line 82.
(gdb) c
Continuing.

Breakpoint 2, 0x00007ffff7b01d10 in __open_nocancel () at ../sysdeps/unix/syscall-template.S:82
82  in ../sysdeps/unix/syscall-template.S
(gdb) p $rax
$1 = -2

瞧,open(2)系统调用返回-2,存根将转换为设置errnoENOENT在此系统上为 2)并返回-1

如果open(2)成功,则条件$rax < 0为假,GDB 将继续运行。

这正是在许多成功的系统调用中寻找一个失败的系统调用时,人们通常希望 GDB 提供的行为。

更新:

正如 Chris Dodd 所指出的,有两个系统调用,但在出错时它们都分支到相同的错误处理代码(设置 的代码errno)。因此,我们可以在 上设置一个无条件断点*0x00007ffff7b01d49,并且该断点只会在失败时触发。

这要好得多,因为条件断点在条件为假时会大大减慢执行速度(GDB 必须停止劣质断点,评估条件,如果条件为假,则恢复劣质断点)。

于 2014-09-24T03:04:50.787 回答