linux - What happens when you run a program?

Question

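I would like to collect here what happens when you run an executable on Windows, Linux and OSX. In particular, I would like to understand the exact order of operations. My guess is that the kernel loads the executable file format (PE, ELF or Mach-O), although I do not know the various sections of ELF (Executable and Linkable Format) or what they mean. Then the dynamic linker resolves the references, then the __init part of the executable runs, then main, then __fini, and then the program is complete. I am sure this is very rough, and probably wrong.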

Edit: the question is now CW. I am filling in the Linux part. If anyone wants to do the same for Win and OSX, that would be great.

score 35 · Accepted Answer

Of course, this is only at a very high and abstract level!

Executable - No Shared Library:

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel allocates memory from the pool to fit the binary image into
  ->Kernel loads binary into memory
  ->Kernel jumps to specific memory address
  ->Kernel starts processing the machine code located at this location
  ->When the machine code reaches a stop
  ->Kernel releases memory back to pool
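On Linux, "the shell informs the kernel" usually comes down to a `fork` followed by an `exec`, after which the shell waits for the child to finish. The following is a minimal sketch of that sequence, assuming a POSIX system; `run_program` is a hypothetical helper name, not part of any shell.

```python
import os
import sys


def run_program(argv):
    """Fork, replace the child with argv[0], and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: ask the kernel to replace this process image with the binary.
        try:
            os.execv(argv[0], argv)
        except OSError:
            pass
        # Only reached if exec failed; 127 is the conventional "not found" code.
        os._exit(127)
    # Parent (the "shell"): block until the kernel reports the child is done.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # killed by a signal


if __name__ == "__main__":
    code = run_program([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

The memory release in the last step of the list happens inside the kernel when the child exits; the parent only collects the exit status.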

Executable - Shared Library

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel allocates memory from the pool to fit the binary image into
  ->Kernel loads binary into memory
  ->Kernel jumps to specific memory address
  ->Kernel starts processing the machine code located at this location
  ->Kernel pushes current location into an execution stack
  ->Kernel jumps out of current memory to a shared memory location
  ->Kernel executes code from this shared memory location
  ->Kernel pops back the last memory location and jumps to that address
  ->When the machine code reaches a stop
  ->Kernel releases memory back to pool
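The jump into shared code and back can be observed from user space. The sketch below loads the C library at run time through `ctypes` and calls `strlen` from it. It assumes a Linux (or other Unix-like) system where `ctypes.CDLL(None)` exposes the C library's symbols, and `c_strlen` is a hypothetical wrapper name. Note that this is run-time loading, not the load-time linking a compiled executable normally gets, and the call and return are ordinary CPU instructions rather than kernel operations.

```python
import ctypes

# CDLL(None) asks the dynamic loader for the symbols already visible to this
# process, which on Linux includes the shared C library.
libc = ctypes.CDLL(None)

# Describe the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Call strlen() inside the shared library and return its result."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"hello"))
```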

JavaScript/.NET/Perl/Python/PHP/Ruby (Interpreted Languages)

Client request to run application
  ->Shell informs kernel to run binary
  ->Kernel has a hook that recognises that the binary image needs a JIT
  ->Kernel calls JIT
  ->JIT loads the code and jumps to a specific address
  ->JIT reads the code and compiles the instruction into the
    machine code of the machine that the interpreter is running on
  ->Interpreter passes machine code to the kernel
  ->Kernel executes the required instruction
  ->JIT then increments the program counter
  ->When the code reaches a stop
  ->JIT releases application from its memory pool
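One concrete form of such a hook on Linux is the `#!` line at the top of a script: the kernel reads it, finds the interpreter path, and runs that interpreter with the script as an argument. The following is a simplified sketch of that parsing step only; `parse_shebang` is a hypothetical name, and real kernels apply extra rules (such as a length limit) that are not modelled here.

```python
def parse_shebang(first_line: bytes):
    """Return (interpreter, optional_argument), or None if there is no '#!'."""
    if not first_line.startswith(b"#!"):
        return None
    # Keep only the first line and trim surrounding whitespace.
    rest = first_line[2:].split(b"\n", 1)[0].strip()
    if not rest:
        return None
    # Linux treats everything after the interpreter path as ONE argument.
    parts = rest.split(None, 1)
    interpreter = parts[0].decode()
    argument = parts[1].strip().decode() if len(parts) > 1 else None
    return interpreter, argument


if __name__ == "__main__":
    print(parse_shebang(b"#!/usr/bin/env python3\n"))
```

A file that starts with the ELF magic bytes instead of `#!` yields `None` here, which corresponds to the kernel trying a different binary format handler.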

As routeNpingme says, registers get set inside the CPU and the magic happens!

Update: Yes, I can't spell properly today!

score 32 · Accepted Answer

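OK, answering my own question. This will be done progressively, and only for Linux (and maybe Mach-O). Feel free to add more material to your own answers, so that they get upvoted (and you can get badges, since it is now CW).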

I will start halfway and build the rest as I find things out. This document was made using x86_64, gcc (GCC) 4.1.2.

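Opening the file, initialization

In this section we describe what happens when a program is invoked, from the kernel's point of view, until the program is ready to execute.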

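The ELF file is opened.
The kernel looks for the .text section and loads it into memory, marking it read-only.
The kernel loads the .data section.
The kernel loads the .bss section and initializes all of its contents to zero.
The kernel transfers control to the dynamic linker (whose name is stored inside the ELF file, in the .interp section). The dynamic linker resolves all the shared library calls.
Control is transferred to the application.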
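The information used in these steps can be read straight from the file. The sketch below parses the ELF header and program headers to find the entry point, the dynamic linker path, and the loadable regions with their permissions. It assumes a 64-bit little-endian ELF file, and `read_elf64` is a hypothetical helper name. A region whose size in memory exceeds its size in the file is where the zero-filled .bss ends up.

```python
import struct

# Layouts of the ELF64 file header and one program header (little-endian).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")
PT_LOAD, PT_INTERP = 1, 3       # program header types
PF_X, PF_W, PF_R = 1, 2, 4      # permission flag bits


def read_elf64(data: bytes):
    """Return the entry point, interpreter path and loadable regions."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    (_, _, _, _, entry, phoff, _, _, _,
     phentsize, phnum, _, _, _) = ELF_HEADER.unpack_from(data, 0)

    interp, load = None, []
    for i in range(phnum):
        (p_type, flags, offset, vaddr, _, filesz, memsz,
         _) = PROGRAM_HEADER.unpack_from(data, phoff + i * phentsize)
        if p_type == PT_LOAD:
            perms = (("r" if flags & PF_R else "-")
                     + ("w" if flags & PF_W else "-")
                     + ("x" if flags & PF_X else "-"))
            load.append((hex(vaddr), filesz, memsz, perms))
        elif p_type == PT_INTERP:
            # The path of the dynamic linker, stored as a C string.
            interp = data[offset:offset + filesz].rstrip(b"\0").decode()
    return {"entry": entry, "interp": interp, "load": load}


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(read_elf64(f.read()))
```

Run against a dynamically linked binary, the `interp` field shows the dynamic linker the kernel hands control to before the application starts.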

Execution of the program

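The function _start gets called, because the ELF header specifies it as the entry point of the executable.
_start calls __libc_start_main in glibc (through the PLT), passing the following information to it:
1. the address of the actual main function
2. the address of argc
3. the address of argv
4. the address of the _init routine
5. the address of the _fini routine
6. a function pointer for atexit() registration
7. the highest stack address available
_init gets called:
1. It calls call_gmon_start to initialize gmon profiling. Not really related to execution.
2. It calls frame_dummy, which wraps __register_frame_info(eh_frame section address, bss section address) (FIXME: what does this function do? Apparently it initializes global variables from the BSS section.)
3. It calls __do_global_ctors_aux, whose role is to call all the global constructors listed in the .ctors section.
main is called.
main ends.
_fini gets called, which in turn calls __do_global_dtors_aux to run all the destructors specified in the .dtors section.
The program exits.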
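The ordering described above can be summarised as a toy model. This is not glibc's code: `libc_start_main`, `demo` and the callbacks are hypothetical stand-ins, and the only thing the sketch claims to show is the sequence _init, main, handlers registered with atexit (most recent first), then _fini.

```python
def libc_start_main(main, argv, init, fini):
    """Toy model of the call order; the real routine does much more."""
    exit_handlers = []

    def atexit(handler):
        # Handlers are remembered and later run in reverse order.
        exit_handlers.append(handler)

    init()                                   # global constructors
    status = main(len(argv), argv, atexit)   # the program itself
    while exit_handlers:
        exit_handlers.pop()()                # atexit handlers, last in first out
    fini()                                   # global destructors
    return status


def demo():
    """Record the order in which the pieces run."""
    trace = []

    def main(argc, argv, atexit):
        atexit(lambda: trace.append("atexit handler"))
        trace.append("main")
        return 0

    libc_start_main(main, ["prog"],
                    init=lambda: trace.append("_init"),
                    fini=lambda: trace.append("_fini"))
    return trace


if __name__ == "__main__":
    print(demo())
```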

score 5 · Accepted Answer

On Windows, the image is first loaded into memory. The kernel analyzes which libraries (read "DLLs") it is going to need and loads them as well.

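It then edits the program image to insert the memory address of each library function it needs. These addresses already have a slot reserved in the .EXE binary, but the slots are just filled with zeros.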

Then each DLL's DllMain() procedure gets executed, one by one, from the most-required DLL to the last, as if following the order of dependencies.
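"Following the order of dependencies" amounts to a topological sort of the dependency graph. The sketch below illustrates that idea with Python's standard `graphlib` module (Python 3.9 or later). The module names and their dependencies are made up for illustration, `init_order` is a hypothetical helper, and this is not a description of what the Windows loader actually runs.

```python
from graphlib import TopologicalSorter


def init_order(dependencies):
    """Return module names so that every module comes after what it needs.

    `dependencies` maps each module to the set of modules it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical dependency graph: app.exe needs gui.dll and core.dll, etc.
    graph = {
        "app.exe": {"gui.dll", "core.dll"},
        "gui.dll": {"core.dll"},
        "core.dll": {"base.dll"},
        "base.dll": set(),
    }
    print(init_order(graph))
```

In this graph the module that everything else relies on is initialized first and the executable itself comes last.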

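Once all the libraries have been loaded and are ready, the image is finally started, and whatever happens next depends on the language used, the compiler used, and the program itself.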

score 2 · Accepted Answer

As soon as the image is loaded into memory, magic takes over.

Answered 2009-07-30T02:31:38.893

score 0 · Accepted Answer

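Well, depending on your exact definition, you have to take into account JIT compilers for languages such as .Net and Java. When you run a .Net "exe", which is technically not "executable", the JIT compiler steps in and compiles it.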