Python 2.7.3 on Solaris 10

Questions

  1. When my subprocess crashes with a segmentation fault (core dumped), or a user kills it from the shell with SIGTERM or SIGKILL, my main program's signal handler receives a SIGTERM (-15) and my parent program exits. Is this expected behavior, or is it a bad Python build?

Background and Code

I have a Python script that first spawns a worker management thread, which then spawns one or more worker threads. Other work goes on in my main thread, which I cannot block. The management thread and worker threads are rock-solid: my services run for years without restarts. But then we have this subprocess.Popen scenario:

In the run method of the worker thread, I am using:

import subprocess, tempfile, threading

class workerThread(threading.Thread):
    def __init__(self):
        super(workerThread, self).__init__()
    ...
    def run(self):
        ...
        # delete=False: the file outlives this handle and is removed by hand later
        atempfile = tempfile.NamedTemporaryFile(delete=False)

        myprocess = subprocess.Popen(['third-party-cmd', 'with', 'arguments'],
                                     shell=False, stdin=subprocess.PIPE,
                                     stdout=atempfile, stderr=subprocess.STDOUT,
                                     close_fds=True)
        ...

I need myprocess.poll() to check for process termination for two reasons: I have to scan atempfile until I find the relevant information (the file may be > 1 GiB), and I may have to terminate the process on user request or because it has been running too long. Once I find what I am looking for, I stop checking the stdout temp file; I clean it up after the external process is dead and before the worker thread terminates. I need the stdin PIPE in case I have to inject a response to something interactive into the child's stdin stream.
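
A minimal sketch of that polling loop, for illustration only. The helper name `run_and_scan`, the byte string `marker` (standing in for the "relevant information"), and the timeout and interval values are all assumptions, not taken from the real service:

```python
import os
import subprocess
import tempfile
import time


def run_and_scan(cmd, marker, timeout=5.0, interval=0.05):
    """Run cmd, scan its output file for marker, kill it on timeout.

    Hypothetical helper: returns (found, returncode).
    """
    out = tempfile.NamedTemporaryFile(delete=False)
    proc = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,
                            stdout=out, stderr=subprocess.STDOUT,
                            close_fds=True)
    fd = os.open(out.name, os.O_RDONLY)  # separate descriptor, own read offset
    found = False
    tail = b''
    deadline = time.time() + timeout
    try:
        while True:
            # Sample the exit status first, so the read below still sees
            # everything the child wrote before it exited.
            exited = proc.poll() is not None
            while not found:
                chunk = os.read(fd, 65536)
                if not chunk:
                    break  # caught up with the writer for now
                data = tail + chunk
                found = marker in data
                # Keep a short tail so a marker split across reads matches.
                tail = data[-(len(marker) - 1):] if len(marker) > 1 else b''
            if exited:
                break
            if time.time() > deadline:
                proc.kill()  # SIGKILL; a later poll() reaps the child
            time.sleep(interval)
    finally:
        proc.stdin.close()
        os.close(fd)
        out.close()
        os.remove(out.name)
    return found, proc.returncode
```

On POSIX, a negative `returncode` of `-N` means the child was terminated by signal `N`, so the worker thread can tell a crash or an external kill from a normal exit without any signal handler of its own.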

In my main program, I install SIGINT and SIGTERM handlers so that I can perform cleanup if the main Python program is terminated with SIGTERM, or with SIGINT (Ctrl-C) when running from the shell.
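
Roughly, the handler setup looks like the sketch below. The names `shutdown_requested`, `_request_shutdown` and `install_handlers` are made up for illustration; the real cleanup code is not shown here:

```python
import signal
import threading

# Set by the handler; the main loop and worker threads can poll it.
shutdown_requested = threading.Event()


def _request_shutdown(signum, frame):
    # Keep the handler tiny: record the request and return.
    shutdown_requested.set()


def install_handlers():
    # Must be called from the main thread: CPython only allows
    # signal.signal() there, and runs Python-level handlers there too.
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, _request_shutdown)
```

The main thread would call `install_handlers()` once at startup and then check `shutdown_requested.is_set()` in its loop, doing the actual cleanup outside the handler.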

Does anyone have a solid 2.x recipe for child signal handling in threads, e.g. ctypes with sigprocmask?

Any help would be much appreciated. I am just looking for an 'official' recipe or the BEST hack, if one even exists.

Notes

I am using a restricted build of Python. I must use 2.7.3. Third-party-cmd is a program I do not have source for - modifying it is not possible.

1 Answer

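Many things in your description look odd. First, you have several different threads and processes. Which one is crashing, which one receives SIGTERM, which one receives SIGKILL, and as a result of which actions?

Second: why does your parent receive SIGTERM? It cannot be sent implicitly. Someone is calling kill on your parent process, either directly or indirectly (for example, by killing the whole parent process group).

Third: how does your program terminate when you are handling SIGTERM? By definition, the program terminates if the signal is not handled. If it is handled, it does not terminate. What is really happening?

Suggestions: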

    $ cat crsh.c
    #include <stdio.h>

    int main(void)
    {
        int *f = 0x0;

        puts("Crashing");
        *f = 0;
        puts("Crashed");
        return 0;
    }
    $ cat a.py

    import subprocess, sys

    print('begin')
    p = subprocess.Popen('./crsh')
    a = raw_input()
    print(a)
    p.wait()
    print('end')
    $ python a.py 
    begin
    Crashing
    abcd
    abcd
    end

This works. No signal is delivered to the parent. Have you isolated the problem in your program?
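
The same check can be sketched without the C program. This is only an illustration: the one-line child below stands in for `./crsh` by sending itself SIGSEGV, and the name `killed_by` is made up:

```python
import signal
import subprocess
import sys

# Child that kills itself with SIGSEGV, standing in for ./crsh.
child_code = 'import os, signal; os.kill(os.getpid(), signal.SIGSEGV)'
p = subprocess.Popen([sys.executable, '-c', child_code])
p.wait()

# On POSIX a negative returncode -N means the child died from signal N.
killed_by = -p.returncode if p.returncode < 0 else None
print('child killed by signal: %r' % killed_by)
```

The parent keeps running after the child dies; all it sees is the negative `returncode`.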

If the problem is a signal being sent to multiple processes: can you use setpgid to put the child in a separate process group?
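
One way this might look from Python 2.7, as a sketch: `spawn_in_own_group` is a hypothetical helper, and it uses `os.setpgrp` (equivalent to `setpgid(0, 0)`) through `preexec_fn`:

```python
import os
import subprocess
import sys


def spawn_in_own_group(cmd):
    # os.setpgrp() runs in the child between fork and exec, making the
    # child the leader of a new process group.
    return subprocess.Popen(cmd, close_fds=True, preexec_fn=os.setpgrp)
```

Two caveats. Python's documentation warns that `preexec_fn` is not safe in the presence of threads, so test this carefully in a threaded service. Also, a child in its own group no longer receives the terminal's Ctrl-C, so the parent has to terminate it explicitly.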

Is there any reason to create the temporary file? That is a 1 GB file created in your temporary directory. Why not pipe stdout?
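
A sketch of the piped variant, assuming the output can be scanned line by line. `scan_pipe` and `marker` are hypothetical names:

```python
import subprocess
import sys


def scan_pipe(cmd, marker):
    """Hypothetical helper: returns (found, returncode)."""
    proc = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, close_fds=True)
    found = False
    # readline() returns b'' only at EOF, once every writer has closed
    # its end of the pipe.
    for line in iter(proc.stdout.readline, b''):
        if not found and marker in line:
            found = True
        # Keep draining after a match so the child never blocks on a
        # full pipe.
    proc.stdout.close()
    proc.stdin.close()
    return found, proc.wait()
```

Note that `readline()` blocks, so a timeout or a user-requested stop needs a separate mechanism, such as another thread calling `proc.kill()`.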

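If you are really sure you need to handle signals in your parent program (why not try/except KeyboardInterrupt, for example?): could the unspecified behavior of signal() in multithreaded programs be causing these problems (for example, a signal dispatched to a thread that does not handle signals)?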

NOTES
     The effects of signal() in a multithreaded process are unspecified.

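In any case, try to explain more precisely what the threads and processes of your program are, what they do, how the signal handlers are set up and why, who sends the signals, who receives them, and so on.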

Answered 2013-03-12T02:56:14.590