c - Compiling a C program that uses dlopen and dlsym with -fPIC

Question

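I am having a problem with incorrect symbol resolution. My main program loads a shared library with dlopen and looks up symbols in it with dlsym. Both the program and the library are written in C. The library code: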

int a(int b)
{
  return b+1;
}

int c(int d)
{
  return a(d)+1;
}

To make this work on a 64-bit machine, -fPIC is passed to gcc when compiling.

The program is:

#include <dlfcn.h>
#include <stdio.h>

int (*a)(int b);
int (*c)(int d);

int main()
{
  void* lib=dlopen("./libtest.so",RTLD_LAZY);
  a=dlsym(lib,"a");
  c=dlsym(lib,"c");
  int d = c(6);
  int b = a(5);
  printf("b is %d d is %d\n",b,d);
  return 0;
}

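If the program is compiled without -fPIC, everything works fine, but when the program is compiled with -fPIC it crashes with a segmentation fault. Investigation showed that the crash is caused by wrong resolution of the symbol a. The crash occurs whenever a is called, whether from the library or from the main program (the latter observed by commenting out the line that calls c() in the main program).

Calling c() itself causes no problem, probably because c() is not called internally by the library, whereas a() is both a function used inside the library and part of the library's API.

A simple workaround is to compile the program without -fPIC. But that is not always possible, for example when the main program's code itself has to live in a shared library. Another workaround is to rename the pointer to function a to something else. But I cannot find any real solution.

Replacing RTLD_LAZY with RTLD_NOW does not help.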

score 4 · Accepted Answer

I suspect that there is a clash between two global symbols: the function a defined in the library and the pointer a defined in the main program. One solution is to declare a in the main program as static. Alternatively, the Linux man page for dlopen mentions the RTLD_DEEPBIND flag, a Linux-only extension, which you can pass to dlopen and which causes the library to prefer its own symbols over global symbols.
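As a rough sketch of both suggestions combined, the main program's pointers could be declared static and the library opened with RTLD_DEEPBIND. The helper name load_lib is hypothetical and not part of the question's code. The fallback definition of RTLD_DEEPBIND as 0 is an assumption added here so the sketch still compiles where the flag is unavailable. Either change alone may already be enough.

```c
#define _GNU_SOURCE /* asks glibc's <dlfcn.h> to expose RTLD_DEEPBIND */
#include <dlfcn.h>
#include <stdio.h>

/* Assumption: where the flag is not available, fall back to no extra flag. */
#ifndef RTLD_DEEPBIND
#define RTLD_DEEPBIND 0
#endif

/* static gives the pointers internal linkage, so the program no longer
   defines global symbols named a and c that could clash with the
   library's functions of the same name. */
static int (*a)(int);
static int (*c)(int);

/* Hypothetical helper: opens the library and looks up both functions.
   Returns 0 on success, -1 on failure. */
int load_lib(const char *path)
{
    void *lib = dlopen(path, RTLD_LAZY | RTLD_DEEPBIND);
    if (lib == NULL) {
        const char *msg = dlerror();
        fprintf(stderr, "dlopen failed: %s\n", msg ? msg : "unknown error");
        return -1;
    }

    a = (int (*)(int))dlsym(lib, "a");
    c = (int (*)(int))dlsym(lib, "c");
    if (a == NULL || c == NULL) {
        fprintf(stderr, "dlsym could not find a or c\n");
        dlclose(lib);
        return -1;
    }
    return 0;
}
```

After load_lib("./libtest.so") returns 0, main can call a(5) and c(6) as in the question. On older glibc versions the program may need to be linked with -ldl.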

score 0 · Accepted Answer

It seems this issue can also occur in one more case, as it did for me. I have a program and a couple of dynamically linked libraries. When I added one more library, I used a function from a static library (also mine) in it, but I forgot to add that static library to the link list. The linker did not warn me about this, but the program crashed with a segmentation fault.

Maybe this will help someone.

score 0 · Accepted Answer

FWIW, I ran into a similar problem when compiling as C++ and forgetting about name mangling. A solution there is to use extern "C".
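A minimal sketch of that idea, applied to the library from the question and assuming the same source file may be built by either a C or a C++ compiler: the declarations are wrapped in extern "C" only when __cplusplus is defined, so the exported names stay a and c and dlsym(lib, "a") can still find them.

```c
/* libtest.c: the guard keeps the file valid C while giving the
   functions C linkage (unmangled names) when compiled as C++. */
#ifdef __cplusplus
extern "C" {
#endif

int a(int b);
int c(int d);

#ifdef __cplusplus
}
#endif

/* The definitions pick up the linkage of the declarations above. */
int a(int b)
{
    return b + 1;
}

int c(int d)
{
    return a(d) + 1;
}
```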
