I am doing some calculations with Sage and experimenting with fork. I have a very simple test case, which basically looks like this:
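def fork_test():
    import os
    pid = os.fork()
    if pid != 0:
        # Parent: report the child's pid and wait for it to finish.
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            # some dummy matrix calculation
            pass
        finally:
            # Leave the child immediately, skipping the parent's cleanup handlers.
            os._exit(0)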
(See _fork_test_func() below for an example of the matrix calculation.)
I get:
------------------------------------------------------------------------
Unhandled SIGILL: An illegal instruction occurred in Sage.
This probably occurred because a *compiled* component of Sage has a bug
in it and is not properly wrapped with sig_on(), sig_off(). You might
want to run Sage under gdb with 'sage -gdb' to debug this.
Sage will now terminate.
------------------------------------------------------------------------
With this (incomplete) backtrace:
Crashed Thread: 0 Dispatch queue: com.apple.root.default-priority
Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Application Specific Information:
BUG IN LIBDISPATCH: flawed group/semaphore logic
Thread 0 Crashed:: Dispatch queue: com.apple.root.default-priority
0 libsystem_kernel.dylib 0x00007fff8c6d1d46 __kill + 10
1 libcsage.dylib 0x0000000101717f33 sigdie + 124
2 libcsage.dylib 0x0000000101717719 sage_signal_handler + 364
3 libsystem_c.dylib 0x00007fff86b1094a _sigtramp + 26
4 libdispatch.dylib 0x00007fff89a66c74 _dispatch_thread_semaphore_signal + 27
5 libdispatch.dylib 0x00007fff89a66f3e _dispatch_apply2 + 143
6 libdispatch.dylib 0x00007fff89a66e30 dispatch_apply_f + 440
7 libBLAS.dylib 0x00007fff906ca435 APL_dtrsm + 1963
8 libBLAS.dylib 0x00007fff906702b6 cblas_dtrsm + 882
9 matrix_modn_dense_double.so 0x0000000108612615 void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::delayed<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, unsigned long, unsigned long) + 2853
10 matrix_modn_dense_double.so 0x0000000108611daa void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::delayed<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, unsigned long, unsigned long) + 698
11 matrix_modn_dense_double.so 0x0000000108612ccf void FFLAS::Protected::ftrsmRightLowerNoTransUnit<double>::operator()<FFPACK::Modular<double> >(FFPACK::Modular<double> const&, unsigned long, unsigned long, FFPACK::Modular<double>::Element*, unsigned long, FFPACK::Modular<double>::Element*, unsigned long) + 831
12 ??? 0x00007f99e481a028 0 + 140298940424232
Thread 1:
0 libsystem_kernel.dylib 0x00007fff8c6d26d6 __workq_kernreturn + 10
1 libsystem_c.dylib 0x00007fff86b24f4c _pthread_workq_return + 25
2 libsystem_c.dylib 0x00007fff86b24d13 _pthread_wqthread + 412
3 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
Thread 2:
0 libsystem_kernel.dylib 0x00007fff8c6d26d6 __workq_kernreturn + 10
1 libsystem_c.dylib 0x00007fff86b24f4c _pthread_workq_return + 25
2 libsystem_c.dylib 0x00007fff86b24d13 _pthread_wqthread + 412
3 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x00007fff5ec8e418 rcx: 0x00007fff5ec8df28 rdx: 0x0000000000000000
rdi: 0x000000000000b8f7 rsi: 0x0000000000000004 rbp: 0x00007fff5ec8df40 rsp: 0x00007fff5ec8df28
r8: 0x00007fff5ec8e418 r9: 0x0000000000000000 r10: 0x000000000000000a r11: 0x0000000000000202
r12: 0x00007f99ea500de0 r13: 0x0000000000000003 r14: 0x00007fff5ec8e860 r15: 0x00007fff906ca447
rip: 0x00007fff8c6d1d46 rfl: 0x0000000000000202 cr2: 0x00007fff74a29848
Logical CPU: 0
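Is there anything special I need to do after a fork? I looked at Sage's fork decorator, and it appears to do basically the same thing.
The crash also happens with Sage's own fork decorator. Another test case: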
def fork_test2():
    def test():
        # do some stuff
        pass
    from sage.parallel.decorate import fork
    test_ = fork(test, verbose=True)
    test_()
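For context, a fork-based wrapper of this kind typically forks, runs the function in the child, and passes the result back to the parent through a pipe. The following is only a minimal sketch of that general pattern, written for plain Python 3 with the standard library rather than Sage's Python 2. It is not Sage's actual implementation, and `run_in_fork` is a hypothetical name introduced here for illustration.

```python
import os
import pickle


def run_in_fork(func, *args):
    """Run func(*args) in a forked child and return its result to the parent.

    Hypothetical helper for illustration; not Sage's actual fork decorator.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, send the pickled result through the pipe, then
        # exit without running the parent's cleanup handlers.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(func(*args), out)
            status = 0
        finally:
            os._exit(status)
    # Parent: read until EOF, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as src:
        payload = src.read()
    os.waitpid(pid, 0)
    if not payload:
        raise RuntimeError("child exited without sending a result")
    return pickle.loads(payload)
```

Calling `run_in_fork(sum, [1, 2, 3])` runs `sum` in the child and returns its value in the parent. If the child dies before writing anything, the parent sees an empty payload and raises instead of hanging.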
An even simpler test case:
def _fork_test_func():
    while True:
        m = matrix(QQ, 100, [randrange(-100,100) for i in range(100*100)])
        m.right_kernel()

def fork_test():
    import os
    pid = os.fork()
    if pid != 0:
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            _fork_test_func()
        finally:
            os._exit(0)
This results in a slightly different crash:
python(48672) malloc: *** error for object 0x11185f000: pointer being freed already on death-row
*** set a breakpoint in malloc_error_break to debug
With this backtrace:
Crashed Thread: 1 Dispatch queue: com.apple.root.default-priority
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Application Specific Information:
*** error for object 0x11185f000: pointer being freed already on death-row
Thread 0:: Dispatch queue: com.apple.main-thread
0 matrix2.so 0x0000000107fa403f __pyx_pw_4sage_6matrix_7matrix2_6Matrix_71right_kernel_matrix + 27551
1 ??? 0x000000000000000d 0 + 13
Thread 1 Crashed:: Dispatch queue: com.apple.root.default-priority
0 libsystem_kernel.dylib 0x00007fff8c6d239a __semwait_signal_nocancel + 10
1 libsystem_c.dylib 0x00007fff86b17e1b nanosleep$NOCANCEL + 138
2 libsystem_c.dylib 0x00007fff86b7b9a8 usleep$NOCANCEL + 54
3 libsystem_c.dylib 0x00007fff86b67eca __abort + 203
4 libsystem_c.dylib 0x00007fff86b67dff abort + 192
5 libsystem_c.dylib 0x00007fff86b43905 szone_error + 580
6 libsystem_c.dylib 0x00007fff86b43f7d free_large + 229
7 libsystem_c.dylib 0x00007fff86b3b8f8 free + 199
8 libBLAS.dylib 0x00007fff906b0431 __APL_dgemm_block_invoke_0 + 132
9 libdispatch.dylib 0x00007fff89a65f01 _dispatch_call_block_and_release + 15
10 libdispatch.dylib 0x00007fff89a620b6 _dispatch_client_callout + 8
11 libdispatch.dylib 0x00007fff89a631fa _dispatch_worker_thread2 + 304
12 libsystem_c.dylib 0x00007fff86b24d0b _pthread_wqthread + 404
13 libsystem_c.dylib 0x00007fff86b0f1d1 start_wqthread + 13
The same crash also happens with this:
def fork_test2():
    from sage.parallel.decorate import fork
    test_ = fork(_fork_test_func, verbose=True)
    test_()
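However, it only crashes if some other matrix calculation has already been run in the session.
The following test case also reproduces the crash in a fresh Sage session: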
def _fork_test_func(iterator=None):
    if not iterator:
        import itertools
        iterator = itertools.count()
    for i in iterator:
        m = matrix(QQ, 100, [randrange(-100,100) for i in range(100*100)])
        m.right_kernel()

def fork_test():
    _fork_test_func(range(10))
    import os
    pid = os.fork()
    if pid != 0:
        print "parent, child: %i" % pid
        os.waitpid(pid, 0)
    else:
        print "child"
        try:
            _fork_test_func()
        finally:
            os._exit(0)
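The test cases above discard the status returned by os.waitpid(), so the parent never reports how the child ended. As a hedged sketch of how that status could be decoded, here is a plain Python 3 version using only the standard library (Sage 5.8 itself uses Python 2, where `signal.Signals` is not available). The names `describe_exit` and `fork_and_report` are hypothetical and introduced only for this illustration.

```python
import os
import signal


def describe_exit(status):
    """Turn a raw os.waitpid() status into a readable string."""
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        return "killed by %s" % signal.Signals(signum).name
    if os.WIFEXITED(status):
        return "exited with code %d" % os.WEXITSTATUS(status)
    return "unknown status %d" % status


def fork_and_report(child_func):
    """Fork, run child_func in the child, and report how the child ended."""
    pid = os.fork()
    if pid == 0:
        # Child: run the workload, then leave without parent cleanup.
        try:
            child_func()
        finally:
            os._exit(0)
    # Parent: wait for the child and decode its termination status.
    _, status = os.waitpid(pid, 0)
    return describe_exit(status)
```

With this, the parent can tell a normal exit from a child that was terminated by a signal, which may help narrow down where a crash like the ones above originates.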
I downloaded the Sage 5.8 binary for Mac OS X 64-bit.
(Note that I have also asked this on ask.sagemath.org.)