linux - 动态加载和弱符号分辨率

Question

分析这个问题，我发现了一些关于dlopen在 Linux 上动态加载（）上下文中弱符号解析的行为。现在我正在寻找管理这个的规范。

让我们举个例子。假设有一个程序a可以动态加载库，b.so并c.so按此顺序。如果c.so依赖于其他两个库foo.so（实际上libgcc.so在那个例子中）和bar.so（实际上），那么通常可以使用libpthread.so导出的符号来满足. 但如果还依赖而不依赖，那么这些弱符号显然不会被联系起来。似乎inkages 只从它们的所有依赖项中寻找符号。bar.sofoo.sob.sofoo.sobar.sobar.sofoo.soab.so

这在某种程度上是有道理的，因为否则加载c.so可能会改变已经使用该库foo.so的某个点的行为。b.so另一方面，在让我开始的问题中，这引起了很多麻烦，所以我想知道是否有办法解决这个问题。为了找到解决方法，我首先需要很好地了解在这些情况下如何指定符号解析的非常准确的细节。

在这些场景中定义正确行为的规范或其他技术文档是什么？

score 14 · Accepted Answer

Unfortunately, the authoritative documentation is the source code. Most distributions of Linux use glibc or its fork, eglibc. In the source code for both, the file that should document dlopen() reads as follows:

manual/libdl.texi

@c FIXME these are undocumented:
@c dladdr
@c dladdr1
@c dlclose
@c dlerror
@c dlinfo
@c dlmopen
@c dlopen
@c dlsym
@c dlvsym

What technical specification there is can be drawn from the ELF specification and the POSIX standard. The ELF specification is what makes a weak symbol meaningful. POSIX is the actual specification for dlopen() itself.

This is what I find to be the most relevant portion of the ELF specification.

When the link editor searches archive libraries, it extracts archive members that contain definitions of undefined global symbols. The member’s definition may be either a global or a weak symbol.

The ELF specification makes no reference to dynamic loading so the rest of this paragraph is my own interpretation. The reason I find the above relevant is that resolving symbols occurs at a single "when". In the example you give, when program a dynamically loads b.so, the dynamic loader attempts to resolve undefined symbols. It may end up doing so with either global or weak symbols. When the program then dynamically loads c.so, the dynamic loader again attempts to resolve undefined symbols. In the scenario you describe, symbols in b.so were resolved with weak symbols. Once resolved, those symbols are no longer undefined. It doesn't matter if global or weak symbols were used to defined them. They're already no longer undefined by the time c.so is loaded.

The ELF specification gives no precise definition of what a link editor is or when the link editor must combine object files. Presumably it's a non-issue because the document has dynamic-linking in mind.

POSIX describes some of the dlopen() functionality but leaves much up to the implementation, including the substance of your question. POSIX makes no reference to the ELF format or weak symbols in general. For systems implementing dlopen() there need not even be any notion of weak symbols.

http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlopen.html

POSIX compliance is part of another standard, the Linux Standard Base. Linux distributions may or may not choose to follow these standards and may or may not go to the trouble of being certified. For example, I understand that a formal Unix certification by Open Group is quite expensive -- hence the abundance of "Unix-like" systems.

An interesting point about the standards compliance of dlopen() is made on the Wikipedia article for dynamic loading. dlopen(), as mandated by POSIX, returns a void*, but C, as mandated by ISO, says that a void* is a pointer to an object and such a pointer is not necessarily compatible with a function pointer.

The fact remains that any conversion between function and object pointers has to be regarded as an (inherently non-portable) implementation extension, and that no "correct" way for a direct conversion exists, since in this regard the POSIX and ISO standards contradict each other.

The standards that do exist contradict and what standards documents there are may not be especially meaningful anyway. Here's Ulrich Drepper writing about his disdain for Open Group and their "specifications".

http://udrepper.livejournal.com/8511.html

Similar sentiment is expressed in the post linked by rodrigo.

The reason I've made this change is not really to be more conformant (it's nice but no reason since nobody complained about the old behaviour).

After looking into it, I believe the proper answer to the question as you've asked it is that there is no right or wrong behavior for dlopen() in this regard. Arguably, once a search has resolved a symbol it is no longer undefined and in subsequent searches the dynamic loader will not attempt to resolve the already defined symbol.

Finally, as you state in the comments, what you describe in the original post is not correct. Dynamically loaded shared libraries can be used to resolve undefined symbols in previously dynamically loaded shared libraries. In fact, this isn't limited to undefined symbols in dynamically loaded code. Here is an example in which the executable itself has an undefined symbol that is resolved through dynamic loading.

main.c

#include <dlfcn.h>

void say_hi(void);

int main(void) {
    void* symbols_b = dlopen("./dyload.so", RTLD_NOW | RTLD_GLOBAL);
    /* uh-oh, forgot to define this function */
    /* better remember to define it in dyload.so */
    say_hi();
    return 0;
}

dyload.c

#include <stdio.h>
void say_hi(void) {
    puts("dyload.so: hi");
}

Compile and run.

gcc-4.8 main -fpic -ldl -Wl,--unresolved-symbols=ignore-all -o main
gcc-4.8 dyload.c -shared -fpic -o dyload.so
$ ./main
dyload.so: hi

Note that the main executable itself was compiled as PIC.

linux - 动态加载和弱符号分辨率

1 回答 1

Related

Reference