TL;DR
You are right. These relocations are just a way of finding out which implementation of a function (libc functions, but not only those) should be used. They are resolved before main
is executed, by the function __libc_start_main,
which is inserted into the binary at link time.
I will try to explain how this relocation type works.
The example
I am using this code as a reference:
//test.c
#include <stdio.h>
#include <string.h>
int main(void)
{
char tmp[10];
char target[10];
fgets(tmp, 10, stdin);
strcpy(target, tmp);
}
Compiled with GCC 7.3.1:
gcc -O0 -g -no-pie -fno-pie -o test -static test.c
The shortened output of the relocation table (readelf -r test):
Relocation section '.rela.plt' at offset 0x1d8 contains 21 entries:
Offset Info Type Sym. Value Sym. Name + Addend
...
00000069bfd8 000000000025 R_X86_64_IRELATIV 415fe0
00000069c018 000000000025 R_X86_64_IRELATIV 416060
The shortened output of the section headers (readelf -S test):
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[19] .got.plt PROGBITS 000000000069c000 0009c000
0000000000000020 0000000000000008 WA 0 0 8
...
It says that the .got.plt section is at address 0x69c000.
How the R_X86_64_IRELATIVE relocation is resolved
Every record in the relocation table contains two important pieces of information: the offset and the addend. For this relocation type, the addend is a pointer to a function (also called an indirect function, or resolver) which takes no arguments and returns a pointer to a function. The returned pointer is stored at the address given by the offset field of the relocation record.
A simple relocation resolver implementation:
#include <stdint.h>

void resolve_reloc(uintptr_t *offset, void *(*addend)(void))
{
    // addend is a pointer to the resolver; store the pointer it returns
    *offset = (uintptr_t)addend();
}
Back to the example at the start of this answer: the last addend in the relocation table points to the address 0x416060,
which is the function strcpy_ifunc.
See the disassembly output:
0000000000416060 <strcpy_ifunc>:
416060: f6 05 05 8d 28 00 10 testb $0x10,0x288d05(%rip) # 69ed6c <_dl_x86_cpu_features+0x4c>
416067: 75 27 jne 416090 <strcpy_ifunc+0x30>
416069: f6 05 c1 8c 28 00 02 testb $0x2,0x288cc1(%rip) # 69ed31 <_dl_x86_cpu_features+0x11>
416070: 75 0e jne 416080 <strcpy_ifunc+0x20>
416072: 48 c7 c0 70 dd 42 00 mov $0x42dd70,%rax
416079: c3 retq
41607a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
416080: 48 c7 c0 30 df 42 00 mov $0x42df30,%rax
416087: c3 retq
416088: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41608f: 00
416090: 48 c7 c0 f0 0e 43 00 mov $0x430ef0,%rax
416097: c3 retq
416098: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41609f: 00
strcpy_ifunc
picks the best of all the available strcpy
implementations and returns a pointer to it. In my case it returns the address 0x430ef0,
which is
__strcpy_sse2_unaligned.
This address is then stored at 0x69c018,
which is .got.plt + 0x18.
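To make the shape of the disassembly easier to follow, here is a minimal C sketch of the same decision logic. It is not glibc's source: the cpu_features byte array, select_strcpy and the three copy_* functions are hypothetical stand-ins, and only the two bit tests (offsets 0x4c and 0x11, masks 0x10 and 0x2) are taken from the listing above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _dl_x86_cpu_features: just raw bytes. */
static uint8_t cpu_features[0x50];

typedef char *(*strcpy_fn)(char *, const char *);

/* Stand-in implementations; in this sketch they all just wrap strcpy. */
static char *copy_baseline(char *d, const char *s)  { return strcpy(d, s); }
static char *copy_variant_a(char *d, const char *s) { return strcpy(d, s); }
static char *copy_variant_b(char *d, const char *s) { return strcpy(d, s); }

/* Same shape as strcpy_ifunc: no arguments, returns a function pointer. */
static strcpy_fn select_strcpy(void)
{
    if (cpu_features[0x4c] & 0x10)  /* testb $0x10 at +0x4c */
        return copy_variant_b;      /* mov $0x430ef0,%rax in the listing */
    if (cpu_features[0x11] & 0x02)  /* testb $0x2 at +0x11 */
        return copy_variant_a;      /* mov $0x42df30,%rax in the listing */
    return copy_baseline;           /* mov $0x42dd70,%rax in the listing */
}
```

Calling select_strcpy() once and storing the result in a function pointer corresponds to what the relocation does with the .got.plt slot: later calls go through the stored pointer and never run the selection again.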
Who resolves it, and when
Usually the first thought with relocations is that all of this is handled by the dynamic interpreter (ld.so).
But in this case the program is statically linked and has no .interp
section, so there is no interpreter to do the work. Instead, the relocations are resolved in the function __libc_start_main,
which is part of glibc. Besides resolving relocations, this function also takes care of passing the command-line arguments to your main
and does some other setup.
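A startup routine that resolves these records needs little more than a loop over the table. The following is a sketch of such a loop, not glibc's actual code: rela_entry mirrors the layout of Elf64_Rela from <elf.h>, while apply_irelative, demo_resolver and demo_target are hypothetical names used only for illustration. The type value 37 (0x25) is the one shown in the Info column of the readelf output.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as Elf64_Rela, redeclared so the sketch is self-contained. */
typedef struct {
    uint64_t r_offset;  /* address of the slot to patch */
    uint64_t r_info;    /* relocation type in the low 32 bits */
    int64_t  r_addend;  /* for IRELATIVE: address of the resolver */
} rela_entry;

#define RELOC_IRELATIVE 37u  /* 0x25, as in the Info column */

typedef void *(*resolver_fn)(void);

/* Hypothetical helper: apply every IRELATIVE record in [begin, end)
   and return how many records were applied. */
static size_t apply_irelative(const rela_entry *begin, const rela_entry *end)
{
    size_t applied = 0;
    for (const rela_entry *r = begin; r < end; ++r) {
        if ((uint32_t)r->r_info != RELOC_IRELATIVE)
            continue;  /* other relocation types are ignored here */
        resolver_fn resolve = (resolver_fn)(uintptr_t)r->r_addend;
        uintptr_t *slot = (uintptr_t *)(uintptr_t)r->r_offset;
        *slot = (uintptr_t)resolve();  /* store the chosen implementation */
        ++applied;
    }
    return applied;
}

/* Demo resolver, only for trying the helper out. */
static int demo_target;
static void *demo_resolver(void) { return &demo_target; }
```

In a statically linked, non-PIE binary like the one in this answer, r_offset is already an absolute address, which is why the sketch can write through it directly.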
Access to the relocation table
When I figured this out I had one last question: how does __libc_start_main
access the relocation table stored in the ELF file? My first thought was that it somehow opens the running binary for reading and processes it. Of course this is totally wrong. If you look at the program headers of the executable you will see something like this (readelf -l test):
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000098451 0x0000000000098451 R E 0x200000
...
The offset in this header is the offset from the first byte of the executable file. So the first item in the program headers says: load the first 0x98451 bytes of the test
file into memory at address 0x400000. But the ELF header sits at offset 0x0, so the ELF headers are loaded into memory together with the code segment. The relocation table at file offset 0x1d8 falls inside the same range, so __libc_start_main
can easily access it.
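The mapping from file offset to memory address can be checked with a few lines of C. This is only an illustrative sketch: load_segment and file_offset_to_vaddr are hypothetical names, and the numbers are copied from the readelf -l and readelf -r output shown in this answer.

```c
#include <stdint.h>

/* The three fields of a LOAD program header that matter here. */
typedef struct {
    uint64_t p_offset;  /* Offset column */
    uint64_t p_vaddr;   /* VirtAddr column */
    uint64_t p_filesz;  /* FileSiz column */
} load_segment;

/* Hypothetical helper: translate a file offset into the virtual address
   it is loaded at, or return 0 if the offset is outside the segment. */
static uint64_t file_offset_to_vaddr(const load_segment *seg, uint64_t off)
{
    if (off < seg->p_offset || off - seg->p_offset >= seg->p_filesz)
        return 0;
    return seg->p_vaddr + (off - seg->p_offset);
}

/* First LOAD segment from the readelf -l output. */
static const load_segment first_load = { 0x0, 0x400000, 0x98451 };
```

With these values, file_offset_to_vaddr(&first_load, 0x0) gives 0x400000 for the ELF header, and file_offset_to_vaddr(&first_load, 0x1d8) gives 0x4001d8 for the start of .rela.plt.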