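I have a Python program (more precisely, a Django application) that starts a subprocess using subprocess.Popen. Due to architectural constraints of my application, I cannot use Popen.terminate() to terminate the subprocess or Popen.poll() to check when it has terminated, because I cannot keep a reference to the started subprocess in a variable. Instead, I have to write the process ID (pid) to a file (pidfile) when the subprocess starts. When I want to stop the subprocess, I open this pidfile and use os.kill(pid, signal.SIGTERM) to stop it.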
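My question is: how can I find out when the subprocess has actually terminated? After os.kill() sends signal.SIGTERM, it takes about 1-2 minutes for the process to finally terminate. At first I thought os.waitpid() would be the right tool for this task, but when I call it after os.kill(), it raises OSError: [Errno 10] No child processes.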
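By the way, I start and stop the subprocess from an HTML template using two forms, and the program logic lives in a Django view. The exception is displayed in my browser when my application is in debug mode. It is probably also important to know that the subprocess I call in my view (python manage.py crawlwebpages) itself calls another subprocess, namely an instance of a Scrapy crawler. I write the pid of this Scrapy instance to the pidfile, and this is the process I want to terminate.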
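The pidfile step itself is not part of the code I am posting. As a rough illustration only, it could look something like the sketch below. The helper name write_pidfile is hypothetical and not taken from my application; only the file name scrapy_crawler_process.pid matches what my view reads.

```python
import os


def write_pidfile(pid, path="scrapy_crawler_process.pid"):
    """Write a process ID to `path` as plain text (hypothetical helper)."""
    with open(path, "w") as pidfile:
        pidfile.write(str(pid))


def read_pidfile(path="scrapy_crawler_process.pid"):
    """Read the process ID back, ignoring surrounding whitespace."""
    with open(path, "r") as pidfile:
        return int(pidfile.read().strip())
```

For example, a process could record itself with write_pidfile(os.getpid()), and the view could later recover the number with read_pidfile().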
Here is the relevant code:
def process_main_page_forms(request):
    if request.method == 'POST':
        if request.POST['form-type'] == u'webpage-crawler-form':
            template_context = _crawl_webpage(request)
        elif request.POST['form-type'] == u'stop-crawler-form':
            template_context = _stop_crawler(request)
    else:
        template_context = {
            'webpage_crawler_form': WebPageCrawlerForm(),
            'stop_crawler_form': StopCrawlerForm()}
    return render(request, 'main.html', template_context)

def _crawl_webpage(request):
    webpage_crawler_form = WebPageCrawlerForm(request.POST)
    if webpage_crawler_form.is_valid():
        url_to_crawl = webpage_crawler_form.cleaned_data['url_to_crawl']
        maximum_pages_to_crawl = webpage_crawler_form.cleaned_data['maximum_pages_to_crawl']
        program = 'python manage.py crawlwebpages' + ' -n ' + str(maximum_pages_to_crawl) + ' ' + url_to_crawl
        p = subprocess.Popen(program.split())
    template_context = {
        'webpage_crawler_form': webpage_crawler_form,
        'stop_crawler_form': StopCrawlerForm()}
    return template_context

def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)
    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
        print 'PROCESS ID:', process_id
        os.kill(process_id, signal.SIGTERM)
        os.waitpid(process_id, os.WNOHANG) # This gives me the OSError
        print 'Crawler process terminated!'
    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}
    return template_context
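What can I do? Thanks a lot!
EDIT:
Thanks to the great answer given by Jacek Konieczny, I was able to solve my problem by changing the code in the function _stop_crawler(request) to the following: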
def _stop_crawler(request):
    stop_crawler_form = StopCrawlerForm(request.POST)
    if stop_crawler_form.is_valid():
        with open('scrapy_crawler_process.pid', 'rb') as pidfile:
            process_id = int(pidfile.read().strip())
        # These are the essential lines
        os.kill(process_id, signal.SIGTERM)
        while True:
            try:
                time.sleep(10)
                os.kill(process_id, 0)
            except OSError:
                break
        print 'Crawler process terminated!'
    template_context = {
        'webpage_crawler_form': WebPageCrawlerForm(),
        'stop_crawler_form': stop_crawler_form}
    return template_context
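
For anyone adapting this: as far as I understand it, os.waitpid() failed because it only works for direct children of the calling process, and the Scrapy process is a child of the python manage.py crawlwebpages process, not of the Django process. The loop above also has no upper bound, always sleeps 10 seconds before the first check, and treats every OSError as "process gone". Below is a minimal sketch of a bounded variant, assuming a POSIX system. The names pid_exists and terminate_and_wait are hypothetical helpers, and the timeout and polling interval are arbitrary defaults, not values from my application.

```python
import errno
import os
import signal
import time


def pid_exists(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only; nothing is sent
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        if exc.errno == errno.EPERM:
            return True  # process exists but belongs to another user
        raise
    return True


def terminate_and_wait(pid, timeout=180.0, poll_interval=1.0):
    """Send SIGTERM, then poll until the process is gone or `timeout` expires.

    Returns True if the process disappeared in time, False otherwise.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return True  # the process was already gone
        raise
    deadline = time.time() + timeout
    while pid_exists(pid):
        if time.time() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

In the view, this would replace the os.kill() call and the while loop with something like terminate_and_wait(process_id). One caveat: if the target is a direct child of the calling process and has not been reaped yet, it lingers as a zombie and os.kill(pid, 0) keeps succeeding, so this approach only suits processes that someone else reaps, which I assume is the case in my setup. A wait this long also blocks the Django view for its whole duration.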