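I'm using CLion with a C++ project (CMake) that starts a JVM. The Java part is built with Gradle. The project works, but I'm having trouble debugging it.

When I start the JVM, I immediately get a SIGSEGV. I understand this is normal and that there is no workaround other than ignoring SIGSEGV. It's a bit annoying but not too bad, since it only happens once per session.

However, when I continue debugging after that, I get constant SIGBUS signals.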

<unknown> 0x000000011f108385
<unknown> 0x000000011761dca7
<unknown> 0x000000011761dca7
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
StackWalk::fetchFirstBatch(BaseFrameStream&, Handle, long, int, int, int, objArrayHandle, Thread*) 0x000000010c227cac
StackWalk::walk(Handle, long, int, int, int, objArrayHandle, Thread*) 0x000000010c2278fc
JVM_CallStackWalk 0x000000010bfb14a2
<unknown> 0x0000000117623950
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
InstanceKlass::call_class_initializer(Thread*) 0x000000010bf22af7
InstanceKlass::initialize_impl(Thread*) 0x000000010bf2244f
Reflection::invoke_constructor(oopDesc*, objArrayHandle, Thread*) 0x000000010c1ebdbb
JVM_NewInstanceFromConstructor 0x000000010bfc14f6
<unknown> 0x0000000117623950
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761dae2
<unknown> 0x000000011761dcec
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761dae2
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x000000011761da00
<unknown> 0x0000000117614849
JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*) 0x000000010bf3a582
jni_invoke_static(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) 0x000000010bf7e2af
jni_CallStaticVoidMethodV 0x000000010bf81c69
JNIEnv_::CallStaticVoidMethod(_jclass*, _jmethodID*, ...) jni.h:1521
main main.cpp:80
start 0x00007fff6f6563d5
start 0x00007fff6f6563d5

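It doesn't stop in my code. I don't understand why this happens, or whether there is any way to avoid these signals other than ignoring all SIGBUS.

I minimized my code and created the simplest example that reproduces the problem. Basically, I created a C++ project that starts a JVM through JNI with org/junit/platform/console/ConsoleLauncher (JUnit 5) as the main class, which runs one simple test. The SIGBUS occurs, and it occurs before my test even runs.

I suspect something in JUnit, but I'm not sure. Is there any way to find the root cause?

The reproduction project is here: https://github.com/tallavi/sigbus-reproduction

If I run it, you can see that the code stops running after the call into the Java part: there is no "After call" and no "CppMainEnd":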

CppMainStart
current_path: /Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out
Loading JAR: jars/junit-platform-console-standalone-1.5.2.jar
Loading JAR: jars/.DS_Store
Loading JAR: jars/junit-platform-console-standalone-1.6.0-M1.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
Before call
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

.
+-- JUnit Jupiter [OK]
| '-- FirstTest [OK]
|   '-- myTest() [OK]
'-- JUnit Vintage [OK]

Test run finished after 154 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]


Process finished with exit code 0

If I just change the main class from JUnit 5's to my own main and run the same code, everything works:

CppMainStart
current_path: /Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out
Loading JAR: jars/junit-platform-console-standalone-1.5.2.jar
Loading JAR: jars/.DS_Store
Loading JAR: jars/junit-platform-console-standalone-1.6.0-M1.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
Before call
main START
main END
After call
CppMainEnd

Process finished with exit code 0

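I handled the signals as suggested by @Oo.oO, but of course that doesn't solve the problem. The Java code finishes, but if I then try to access the JVM, for example to destroy it, it hangs! See: stack trace of the hang

However, if I let it run (rather than trying to debug it), it crashes with a different error: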

main(31549,0x1177515c0) malloc: *** error for object 0x7ffee6360628: pointer being freed was not allocated
main(31549,0x1177515c0) malloc: *** set a breakpoint in malloc_error_break to debug

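With this trace:

When destroying the JVM after handling the signal

Note that the SIGBUS doesn't always occur, but the code after the JVM call stops running 100% of the time.

I hope this makes sense to someone.

Update: this is how it looks in lldb: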

MyComputer:out tal$ lldb main
(lldb) target create "main"
Current executable set to 'main' (x86_64).
(lldb) r
Process 57274 launched: '/Users/tal/Development/v2x/qa-automation/sigbus-reproduction/out/main' (x86_64)
CppMainStart
Process 57274 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x000000010b33f51b
->  0x10b33f51b: movl   (%rsi), %eax
    0x10b33f51d: leaq   0x30(%rbp), %rsi
    0x10b33f521: movl   $0x10000, %eax            ; imm = 0x10000
    0x10b33f526: andl   0x4(%rsi), %eax
Target 0: (main) stopped.
(lldb) c
Process 57274 resuming
CreateVM:       JVM loaded successfully!
Before call
Process 57274 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGBUS
    frame #0: 0x0000000112e263ff
->  0x112e263ff: testl  %eax, (%r10)
    0x112e26402: retq
    0x112e26403: nop
    0x112e26404: nop
Target 0: (main) stopped.
(lldb) c
Process 57274 resuming
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 2740 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

After call
before destroying
after destroying
CppMainEnd
Process 57274 exited with status = 0 (0x00000000)

2 Answers

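Without knowing exactly what environment you have, this may be hard to track down. There are multiple factors here:

  • Boost version
  • Java version
  • compiler version
  • etc.

If I take your sample and strip it down to the bare minimum (like this)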

# Linux

> g++ -o obj/main \
  -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux/ \
  -L${JAVA_HOME}/jre/lib/amd64/server -ljvm \
  -L${BOOST_LIB} -lboost_system -lboost_filesystem \
  -I$BOOST_INC src/main/cpp/main.cpp

> javac -cp jars/junit-platform-console-standalone.jar \
  -d target src/main/java/FirstTest.java

> jar cf jars/sigbus-reproduction.jar -C target .

> ./obj/main

or, slightly modified, on macOS

# macOS

> g++ -std=c++11 -o obj/main \
  -I${JAVA_HOME}/include -I${JAVA_HOME}/include/darwin/ \
  -L${JAVA_HOME}/lib/server -rpath ${JAVA_HOME}/lib/server -ljvm \
  -L${BOOST_LIB} -rpath ${BOOST_LIB} -lboost_system -lboost_filesystem \
  -I$BOOST_INC src/main/cpp/main.cpp

it just works as expected. Moreover, there is neither SIGSEGV nor SIGBUS under gdb/lldb.

> ./obj/main
CppMainStart
current_path: /Users/michalo/tmp/sigbus-reproduction
Loading JAR: jars/junit-platform-console-standalone.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5061 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

I guess that finding someone who can reproduce your issue may take time and effort.

Calling JUnit as a method

#include <iostream>
...
...
...

int main(int argc, char **argv) {

  // make sure to store the original stdout;
  // the JVM (JUnit) will interfere with it
  int old_stdout = dup(1);

  std::cout << "CppMainStart" << std::endl;

...
...
...

  env->SetObjectArrayElement(argsArray, 0, env->NewStringUTF("--class-path"));
  env->SetObjectArrayElement(argsArray, 1, env->NewStringUTF(V2X_FILE_NAME.c_str()));
  env->SetObjectArrayElement(argsArray, 2, env->NewStringUTF((std::string("--scan-classpath")).c_str()));

// instead of calling main, you can call execute

  jclass system_class     = env->FindClass( "java/lang/System");
  jfieldID field_id_out   = env->GetStaticFieldID(system_class, "out", "Ljava/io/PrintStream;");
  jobject field_id_out_v  = env->GetStaticObjectField(system_class, field_id_out);

  jfieldID field_id_err   = env->GetStaticFieldID(system_class, "err", "Ljava/io/PrintStream;");
  jobject field_id_err_v  = env->GetStaticObjectField(system_class, field_id_err);

  jmethodID execMethod = env->GetStaticMethodID(mainClass,
    "execute",
    "(Ljava/io/PrintStream;Ljava/io/PrintStream;[Ljava/lang/String;)Lorg/junit/platform/console/ConsoleLauncherExecutionResult;");

  jobject result = env->CallStaticObjectMethod(mainClass, execMethod, field_id_out_v, field_id_err_v, argsArray);

  jvm->DestroyJavaVM();

  // restore the original stdout
  FILE *fp2 = fdopen(old_stdout, "w");
  *stdout = *fp2;

  std::cout  << "CppMainEnd" << std::endl << std::flush;

  return 0;
}

There you go. CppMainEnd is there at the end.

> ./obj/main
CppMainStart
current_path: /Users/michalo/tmp/sigbus-reproduction
Loading JAR: jars/junit-platform-console-standalone.jar
Loading JAR: jars/sigbus-reproduction.jar
CreateVM:       JVM loaded successfully!
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5060 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

CppMainEnd

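I suggest reducing your code to the bare minimum. Make it as essential as possible. Otherwise, you will have a hard time finding the source of the problem.

If I run code like this (which is really close to the essence of a JNI call):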

#include <cstdio>    // FILE, fdopen
#include <iostream>
#include <jni.h>
#include <unistd.h>  // dup

int main(int argc, char **argv) {

  int old_stdout = dup(1);

  std::cout << "Cpp Start" << std::endl;

  JavaVM *jvm;
  JNIEnv *env;
  JavaVMInitArgs vm_args;
  JavaVMOption* options = new JavaVMOption[1];

  options[0].optionString = const_cast<char *>("-Djava.class.path=jars/junit-platform-console-standalone.jar:jars/sigbus-reproduction.jar");
  vm_args.version = JNI_VERSION_1_6;
  vm_args.nOptions = 1;
  vm_args.options = options;
  vm_args.ignoreUnrecognized = false;

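  jint status = JNI_CreateJavaVM(&jvm, (void**)&env, &vm_args);
  if (status != JNI_OK) {
    // without a VM, env is not valid and must not be used
    std::cerr << "JNI_CreateJavaVM failed: " << status << std::endl;
    delete[] options;
    return 1;
  }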

  jclass mainClass = env->FindClass("org/junit/platform/console/ConsoleLauncher");

  jclass stringClass = env->FindClass("java/lang/String");

  jobject emptyStringObject = env->NewStringUTF("");

  jobjectArray argsArray = env->NewObjectArray(3, stringClass, emptyStringObject);

  env->SetObjectArrayElement(argsArray, 0, env->NewStringUTF("--class-path"));
  env->SetObjectArrayElement(argsArray, 1, env->NewStringUTF("jars/sigbus-reproduction.jar"));
  env->SetObjectArrayElement(argsArray, 2, env->NewStringUTF("--scan-classpath"));

  jclass system_class     = env->FindClass( "java/lang/System");
  jfieldID field_id_out   = env->GetStaticFieldID(system_class, "out", "Ljava/io/PrintStream;");
  jobject field_id_out_v  = env->GetStaticObjectField(system_class, field_id_out);

  jfieldID field_id_err   = env->GetStaticFieldID(system_class, "err", "Ljava/io/PrintStream;");
  jobject field_id_err_v  = env->GetStaticObjectField(system_class, field_id_err);

  jmethodID execMethod = env->GetStaticMethodID(mainClass,
    "execute",
    "(Ljava/io/PrintStream;Ljava/io/PrintStream;[Ljava/lang/String;)Lorg/junit/platform/console/ConsoleLauncherExecutionResult;");

  jobject result = env->CallStaticObjectMethod(mainClass, execMethod, field_id_out_v, field_id_err_v, argsArray);

  jvm->DestroyJavaVM();

  // restore the original stdout
  FILE *fp2 = fdopen(old_stdout, "w");
  *stdout = *fp2;

  std::cout  << "CppMainEnd" << std::endl << std::flush;

  delete[] options;

  return 0;
}

there is nothing strange in lldb:

lldb obj/main
(lldb) target create "obj/main"
Current executable set to 'obj/main' (x86_64).
(lldb) run
Process 921 launched: '.../main' (x86_64)
Cpp Start
Process 921 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x000000010b33f51b
->  0x10b33f51b: movl   (%rsi), %eax
    0x10b33f51d: leaq   0x30(%rbp), %rsi
    0x10b33f521: movl   $0x10000, %eax            ; imm = 0x10000
    0x10b33f526: andl   0x4(%rsi), %eax
Target 0: (main) stopped.
(lldb) cont
Process 921 resuming
test START
test END

Thanks for using JUnit! Support its development at https://junit.org/sponsoring

╷
├─ JUnit Jupiter ✔
│  └─ FirstTest ✔
│     └─ myTest() ✔
└─ JUnit Vintage ✔

Test run finished after 5060 ms
[         3 containers found      ]
[         0 containers skipped    ]
[         3 containers started    ]
[         0 containers aborted    ]
[         3 containers successful ]
[         0 containers failed     ]
[         1 tests found           ]
[         0 tests skipped         ]
[         1 tests started         ]
[         0 tests aborted         ]
[         1 tests successful      ]
[         0 tests failed          ]

CppMainEnd
Process 921 exited with status = 0 (0x00000000)

Running multiple times

No matter how many times I run the code, there is no SIGBUS :(

You can easily run the code (thousands of times) like this:

--- 8< --- CUT HERE --- lldb_run --- 8< --- CUT HERE ---

target create main
break set -n main -C "process handle --pass true --stop false SIGSEGV" -C "continue"
run
script import os; os._exit(0)

--- 8< --- CUT HERE --- lldb_run --- 8< --- CUT HERE ---

Then run it in a loop: for i in {1..100}; do lldb --source ./lldb_run; done

Answered 2019-12-18T14:58:11.310

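You are mistakenly assuming that signals such as SIGSEGV or SIGBUS indicate a problem in Java. By handling them yourself you are also likely to break things such as null pointer detection.

Why do I see SIGSEGV when I strace a Java application on Linux?!

Main entry

Most people who have used Unix for a while are familiar with occasionally seeing "Segmentation fault (core dumped)" from poorly written programs. If that is all you know about Unix and you look at strace output for a Java process, you would think something is very wrong ("Wow, look at all those segfaults. Those guys at Sun/Oracle must be terrible programmers who don't know what they are doing!").

The reality is quite different: a SIGSEGV in a Java process is almost always completely normal and completely safe.

...

The JVM is a multithreaded process, so behind the scenes it uses signals to implement threading at the operating-system level. ...

...

Signal descriptions

  • SIGSEGV, SIGBUS, SIGFPE, SIGPIPE, SIGILL: used in the implementation of implicit null checks, and so forth.
  • SIGQUIT: thread dump support, to dump Java stack traces to the standard error stream. (Optional.)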

...

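Table stolen wholesale from http://download.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/signals.html

Per that link:

6.1 Signal Handling on Solaris OS and Linux

The HotSpot virtual machine installs signal handlers to implement various features and to handle fatal error conditions. For example, in an optimization to avoid explicit null checks in cases where java.lang.NullPointerException will be thrown rarely, the SIGSEGV signal is caught and handled, and the NullPointerException is thrown.

In general, there are two categories of situations in which signals/traps arise.

  • Situations in which signals are expected and handled. Examples include the implicit null handling cited above. Another example is the safepoint polling mechanism, which protects a page in memory when a safepoint is required. Any thread that accesses that page causes a SIGSEGV, which results in the execution of a stub that brings the thread to a safepoint.

  • Unexpected signals. This includes a SIGSEGV while executing in VM code, JNI code, or native code. In these cases the signal is unexpected, so fatal error handling is invoked to create the error log and terminate the process.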

If you need to handle fatal signals, see "Signal handling on Linux when using Java/JNI".

Answered 2019-12-21T12:20:27.317