I have a Linux ELF file, a.out, and I use the following command to extract the disassembly of _start:
objdump -d ./a.out -F | awk -v RS= '/^[[:xdigit:]]+ <_start>/'
I get the following output:
00000000004008e0 <_start> (File Offset: 0x8e0):
4008e0: 31 ed xor %ebp,%ebp
4008e2: 49 89 d1 mov %rdx,%r9
4008e5: 5e pop %rsi
4008e6: 48 89 e2 mov %rsp,%rdx
4008e9: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
4008ed: 50 push %rax
4008ee: 54 push %rsp
4008ef: 49 c7 c0 30 19 40 00 mov $0x401930,%r8
4008f6: 48 c7 c1 a0 18 40 00 mov $0x4018a0,%rcx
4008fd: 48 c7 c7 f0 05 40 00 mov $0x4005f0,%rdi
400904: e8 d7 0a 00 00 callq 4013e0 <__libc_start_main> (File Offset: 0x13e0)
400909: f4 hlt
40090a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
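Counting from the first instruction to the end of the last listed line, this suggests that _start occupies 0x40090a - 0x4008e0 + 6 = 48 bytes. I also used
hexdump -C -s 0x8e0 -n 48 ./a.out
to check the file contents, shown below: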
000008e0 31 ed 49 89 d1 5e 48 89 e2 48 83 e4 f0 50 54 49 |1.I..^H..H...PTI|
000008f0 c7 c0 30 19 40 00 48 c7 c1 a0 18 40 00 48 c7 c7 |..0.@.H....@.H..|
00000900 f0 05 40 00 e8 d7 0a 00 00 f4 66 0f 1f 44 00 00 |..@.......f..D..|
00000910
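The bytes above match the objdump listing.
What confuses me, however, is that readelf -s reports the size of _start as 42 rather than 48. See the command and output below.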
readelf -s ./a.out | awk '{if (NR==3) print $0; if ("1277:"==$1) print $0 }'
Num: Value Size Type Bind Vis Ndx Name
1277: 00000000004008e0 42 FUNC GLOBAL DEFAULT 6 _start
Why doesn't readelf report 48 as the size of the _start symbol?
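As a sanity check on the arithmetic, here is a minimal sketch that reproduces both numbers from the addresses in the listing above. The variable names are only illustrative. It shows that my count of 48 includes the 6-byte nopw at 0x40090a, while 42 is exactly the distance from the start of _start to that nopw.

```bash
#!/usr/bin/env bash
# Addresses copied from the objdump listing above; variable names are illustrative.
start=0x4008e0      # first instruction of _start
last=0x40090a       # address of the trailing nopw
last_len=6          # encoded length of that nopw (66 0f 1f 44 00 00)

with_nop=$(( last - start + last_len ))   # my count from the objdump listing
without_nop=$(( last - start ))           # bytes up to, but excluding, the nopw

echo "including nopw: $with_nop"
echo "excluding nopw: $without_nop"
```

So the 6-byte difference between the two sizes is precisely the length of the final nopw line.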
Update
Following the comments, I wrote a bash script to check every symbol in the .text section. (The script is not perfect, but it works in most cases.)
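# For each sized symbol in section 6 (.text in this file), compare the span seen
# in objdump (first instruction to end of last listed line) with readelf's Size.
# Requires gawk (strtonum and the \< \> word anchors are gawk extensions).
while read -r line; do
    symbol=$(echo "$line" | awk '{print $NF}')
    size=$(echo "$line" | awk '{print $3}')
    # RS= makes awk read blank-line-separated paragraphs, i.e. one function each.
    objdump -d ./a.out | awk -v RS= "/^[[:xdigit:]]+ <$symbol>/" > ./aaa.txt
    nlines=$(wc -l < aaa.txt)
    [ "$nlines" -eq 0 ] && continue
    # Count the encoded bytes on the last disassembly line.
    app=$(tail -n 1 aaa.txt | awk -F: '{print $2}' | awk \
    '{
        for (i=1; i<=NF; i++) {
            if (!match($i, "\\<[0-9a-f]{2}\\>")) {
                break;
            }
        }
        print i-1
    }')
    # Line 2 holds the first instruction, line n the last one.
    total=$(awk -v n="$nlines" -v a="$app" \
    '{
        if (NR==2) {
            # substr is 1-based; a start of 0 would drop the last hex digit.
            ns = "0x" substr($1, 1, length($1)-1);
            start = strtonum(ns);
        }
        if (NR==n) {
            ns = "0x" substr($1, 1, length($1)-1);
            stop = strtonum(ns);
        }
    } END {print stop-start + a}' aaa.txt)
    printf "%10d %-10d %4d %s\n" "$total" "$size" $((total%16)) "$symbol"
done < <(readelf -s ./a.out | awk '{if ($7==6 && $3>0) print $0}')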
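Although the sizes of many symbols do follow the alignment constraint, the script's output does not show that every symbol respects a 16-byte alignment; some of them do not. You can compile any source file with gcc -static to get an ELF file and check it with the script above.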
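Another way to look at the same data, sketched here under assumptions, is to compare each symbol's reported size with the distance to the next symbol's address. The helper symbol_gaps below is hypothetical (it is not part of binutils), and in the demo the successor symbol next_function and its address 0x400910 are assumed for illustration: 0x400910 is simply where the nopw after _start ends.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads lines of the form
#   <hex address> <decimal size> <name>
# sorted by address, and prints "<name> <size> <gap>" for every symbol that
# has a successor, where gap = next address - (address + size).
symbol_gaps() {
    local prev_addr="" prev_size="" prev_name="" addr size name
    while read -r addr size name; do
        if [ -n "$prev_addr" ]; then
            echo "$prev_name $prev_size $(( 16#$addr - (16#$prev_addr + prev_size) ))"
        fi
        prev_addr=$addr; prev_size=$size; prev_name=$name
    done
}

# Demo: _start as reported by readelf, followed by an ASSUMED successor symbol.
symbol_gaps <<'EOF'
00000000004008e0 42 _start
0000000000400910 1 next_function
EOF
```

Against a real binary, the input could probably be produced with something like readelf -sW ./a.out | awk '$7==6 && $3>0 {print $2, $3, $8}' | sort, since the fixed-width hex addresses sort correctly as text. Symbols that alias the same address would show up with negative gaps, so this is only a rough check.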
Update 2
I extracted the disassembly of the function backtrace_and_maps from the objdump -d output, as shown below.
0000000000400390 <backtrace_and_maps> (File Offset: 0x390):
400390: ff cf dec %edi
400392: 0f 8e 3a 01 00 00 jle 4004d2 <backtrace_and_maps+0x142> (File Offset: 0x4d2)
400398: 40 84 f6 test %sil,%sil
40039b: 0f 84 31 01 00 00 je 4004d2 <backtrace_and_maps+0x142> (File Offset: 0x4d2)
4003a1: 55 push %rbp
4003a2: 53 push %rbx
4003a3: be 40 00 00 00 mov $0x40,%esi
4003a8: 89 d5 mov %edx,%ebp
4003aa: 48 81 ec 08 06 00 00 sub $0x608,%rsp
4003b1: 48 89 e7 mov %rsp,%rdi
4003b4: e8 97 2c 04 00 callq 443050 <__backtrace> (File Offset: 0x43050)
4003b9: 83 f8 02 cmp $0x2,%eax
4003bc: 41 89 c0 mov %eax,%r8d
4003bf: 0f 8e 04 01 00 00 jle 4004c9 <backtrace_and_maps+0x139> (File Offset: 0x4c9)
4003c5: 48 63 dd movslq %ebp,%rbx
4003c8: ba 1d 00 00 00 mov $0x1d,%edx
4003cd: be 08 26 4a 00 mov $0x4a2608,%esi
4003d2: 48 89 df mov %rbx,%rdi
4003d5: b8 01 00 00 00 mov $0x1,%eax
4003da: 0f 05 syscall
4003dc: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
4003e2: 76 0c jbe 4003f0 <backtrace_and_maps+0x60> (File Offset: 0x3f0)
4003e4: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
4003eb: f7 d8 neg %eax
4003ed: 64 89 02 mov %eax,%fs:(%rdx)
4003f0: 41 8d 70 ff lea -0x1(%r8),%esi
4003f4: 48 8d 7c 24 08 lea 0x8(%rsp),%rdi
4003f9: 89 ea mov %ebp,%edx
4003fb: e8 b0 2c 04 00 callq 4430b0 <__backtrace_symbols_fd> (File Offset: 0x430b0)
400400: ba 1d 00 00 00 mov $0x1d,%edx
400405: be 26 26 4a 00 mov $0x4a2626,%esi
40040a: 48 89 df mov %rbx,%rdi
40040d: b8 01 00 00 00 mov $0x1,%eax
400412: 0f 05 syscall
400414: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
40041a: 76 0c jbe 400428 <backtrace_and_maps+0x98> (File Offset: 0x428)
40041c: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
400423: f7 d8 neg %eax
400425: 64 89 02 mov %eax,%fs:(%rdx)
400428: 31 f6 xor %esi,%esi
40042a: bf 44 26 4a 00 mov $0x4a2644,%edi
40042f: b8 02 00 00 00 mov $0x2,%eax
400434: 0f 05 syscall
400436: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
40043c: 76 10 jbe 40044e <backtrace_and_maps+0xbe> (File Offset: 0x44e)
40043e: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
400445: f7 d8 neg %eax
400447: 64 89 02 mov %eax,%fs:(%rdx)
40044a: 48 83 c8 ff or $0xffffffffffffffff,%rax
40044e: 4c 63 c0 movslq %eax,%r8
400451: 31 ed xor %ebp,%ebp
400453: 41 ba 01 00 00 00 mov $0x1,%r10d
400459: ba 00 04 00 00 mov $0x400,%edx
40045e: 48 8d b4 24 00 02 00 lea 0x200(%rsp),%rsi
400465: 00
400466: 4c 89 c7 mov %r8,%rdi
400469: 89 e8 mov %ebp,%eax
40046b: 0f 05 syscall
40046d: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
400473: 49 89 c1 mov %rax,%r9
400476: 76 1a jbe 400492 <backtrace_and_maps+0x102> (File Offset: 0x492)
400478: 48 c7 c0 d0 ff ff ff mov $0xffffffffffffffd0,%rax
40047f: 41 f7 d9 neg %r9d
400482: 64 44 89 08 mov %r9d,%fs:(%rax)
400486: 4c 89 c7 mov %r8,%rdi
400489: b8 03 00 00 00 mov $0x3,%eax
40048e: 0f 05 syscall
400490: eb 37 jmp 4004c9 <backtrace_and_maps+0x139> (File Offset: 0x4c9)
400492: 48 85 c0 test %rax,%rax
400495: 7e ef jle 400486 <backtrace_and_maps+0xf6> (File Offset: 0x486)
400497: 4c 89 ca mov %r9,%rdx
40049a: 48 8d b4 24 00 02 00 lea 0x200(%rsp),%rsi
4004a1: 00
4004a2: 48 89 df mov %rbx,%rdi
4004a5: 44 89 d0 mov %r10d,%eax
4004a8: 0f 05 syscall
4004aa: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
4004b0: 76 10 jbe 4004c2 <backtrace_and_maps+0x132> (File Offset: 0x4c2)
4004b2: 48 c7 c2 d0 ff ff ff mov $0xffffffffffffffd0,%rdx
4004b9: f7 d8 neg %eax
4004bb: 64 89 02 mov %eax,%fs:(%rdx)
4004be: 48 83 c8 ff or $0xffffffffffffffff,%rax
4004c2: 49 39 c1 cmp %rax,%r9
4004c5: 74 92 je 400459 <backtrace_and_maps+0xc9> (File Offset: 0x459)
4004c7: eb bd jmp 400486 <backtrace_and_maps+0xf6> (File Offset: 0x486)
4004c9: 48 81 c4 08 06 00 00 add $0x608,%rsp
4004d0: 5b pop %rbx
4004d1: 5d pop %rbp
4004d2: c3 retq
00000000004004d3 <detach_arena.part.0> (File Offset: 0x4d3):
4004d3: 50 push %rax
4004d4: b9 68 38 4a 00 mov $0x4a3868,%ecx
4004d9: ba 75 02 00 00 mov $0x275,%edx
4004de: be e8 29 4a 00 mov $0x4a29e8,%esi
4004e3: bf c0 2d 4a 00 mov $0x4a2dc0,%edi
4004e8: e8 93 71 01 00 callq 417680 <__malloc_assert> (File Offset: 0x17680)
I also dumped the bytes of the ELF file starting at file offset 0x390, with length 0x4004d2 - 0x400390 + 1 + 5 = 328, as shown below.
00000390 ff cf 0f 8e 3a 01 00 00 40 84 f6 0f 84 31 01 00 |....:...@....1..|
000003a0 00 55 53 be 40 00 00 00 89 d5 48 81 ec 08 06 00 |.US.@.....H.....|
000003b0 00 48 89 e7 e8 97 2c 04 00 83 f8 02 41 89 c0 0f |.H....,.....A...|
000003c0 8e 04 01 00 00 48 63 dd ba 1d 00 00 00 be 08 26 |.....Hc........&|
000003d0 4a 00 48 89 df b8 01 00 00 00 0f 05 48 3d 00 f0 |J.H.........H=..|
000003e0 ff ff 76 0c 48 c7 c2 d0 ff ff ff f7 d8 64 89 02 |..v.H........d..|
000003f0 41 8d 70 ff 48 8d 7c 24 08 89 ea e8 b0 2c 04 00 |A.p.H.|$.....,..|
00000400 ba 1d 00 00 00 be 26 26 4a 00 48 89 df b8 01 00 |......&&J.H.....|
00000410 00 00 0f 05 48 3d 00 f0 ff ff 76 0c 48 c7 c2 d0 |....H=....v.H...|
00000420 ff ff ff f7 d8 64 89 02 31 f6 bf 44 26 4a 00 b8 |.....d..1..D&J..|
00000430 02 00 00 00 0f 05 48 3d 00 f0 ff ff 76 10 48 c7 |......H=....v.H.|
00000440 c2 d0 ff ff ff f7 d8 64 89 02 48 83 c8 ff 4c 63 |.......d..H...Lc|
00000450 c0 31 ed 41 ba 01 00 00 00 ba 00 04 00 00 48 8d |.1.A..........H.|
00000460 b4 24 00 02 00 00 4c 89 c7 89 e8 0f 05 48 3d 00 |.$....L......H=.|
00000470 f0 ff ff 49 89 c1 76 1a 48 c7 c0 d0 ff ff ff 41 |...I..v.H......A|
00000480 f7 d9 64 44 89 08 4c 89 c7 b8 03 00 00 00 0f 05 |..dD..L.........|
00000490 eb 37 48 85 c0 7e ef 4c 89 ca 48 8d b4 24 00 02 |.7H..~.L..H..$..|
000004a0 00 00 48 89 df 44 89 d0 0f 05 48 3d 00 f0 ff ff |..H..D....H=....|
000004b0 76 10 48 c7 c2 d0 ff ff ff f7 d8 64 89 02 48 83 |v.H........d..H.|
000004c0 c8 ff 49 39 c1 74 92 eb bd 48 81 c4 08 06 00 00 |..I9.t...H......|
000004d0 5b 5d c3 50 b9 68 38 4a |[].P.h8J|
000004d8
I also grepped the output of readelf -s, as shown below.
Num: Value Size Type Bind Vis Ndx Name
103: 0000000000400390 323 FUNC LOCAL DEFAULT 6 backtrace_and_maps
As you can see, the function backtrace_and_maps really does occupy 323 bytes; its size is not rounded up to a multiple of 16 or 8 bytes.
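The numbers for this function can be checked the same way. This is a small sketch using only addresses from the listing above; the variable names are illustrative.

```bash
#!/usr/bin/env bash
# Addresses copied from the backtrace_and_maps listing; names are illustrative.
first=0x400390      # first instruction of backtrace_and_maps
ret=0x4004d2        # address of the final retq, which is 1 byte long
extra=5             # bytes of detach_arena.part.0 included in my hexdump

size=$(( ret - first + 1 ))
dumped=$(( size + extra ))

echo "function size: $size"
echo "bytes dumped:  $dumped"
echo "size mod 16:   $(( size % 16 ))"
echo "size mod 8:    $(( size % 8 ))"
```

Here the computed size agrees with the 323 reported by readelf, and the remainder is nonzero for both 16 and 8.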