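Suppose you're running Django on Linux, and you have a view that should return the output of a subprocess called cmd, which operates on a file that the view creates. For example: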
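    import subprocess
    import tempfile

    from django.http import HttpResponse

    def call_subprocess(request):
        response = HttpResponse()
        with tempfile.NamedTemporaryFile("w") as f:
            f.write(request.GET['data'])  # i.e. some data
            f.flush()  # make sure the data is on disk before cmd reads it
            # cmd operates on f.name and returns output; it has to run while
            # the temporary file still exists (it is deleted on leaving the block)
            p = subprocess.Popen(["cmd", f.name],
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE)
            out, err = p.communicate()
        response.write(out)  # would be text/plain...
        return response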
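Now, suppose cmd has a very slow start-up time but a very fast running time, and it does not natively have a daemon mode. I would like to improve the response time of this view.
I would like to make the whole system much faster by starting a number of instances of cmd in a worker pool, having them wait for input, and having call_process ask one of those worker-pool processes to handle the data.
This is really 2 parts:
Part 1. A function that calls cmd, where cmd waits for input. This could be done with pipes, i.e.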
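    import os

    def _run_subcmd(fname):
        # runs in the forked child: cmd reads its input from the FIFO at fname
        p = subprocess.Popen(["cmd", fname],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        # write 'out' to a tmp file
        o = open("out.txt", "w")
        o.write(out)
        o.close()
        os._exit(0)  # leave the child at once, skipping the parent's cleanup handlers

    def _run_cmd(data):
        # mkfifo() fails if the path already exists, so create the FIFO inside
        # a fresh temporary directory instead of on top of a NamedTemporaryFile
        fname = os.path.join(tempfile.mkdtemp(), "in.fifo")
        os.mkfifo(fname)
        pid = os.fork()
        if pid == 0:
            _run_subcmd(fname)
        else:
            f = open(fname, "w")  # blocks until cmd opens the FIFO for reading
            f.write(data)
            f.close()
            os.waitpid(pid, 0)  # wait until the child has written out.txt
        # read 'out' from a tmp file
        r = open("out.txt", "r")
        out = r.read()
        r.close()
        return out

    def call_process(request):
        response = HttpResponse()
        out = _run_cmd(request.GET['data'])
        response.write(out)  # would be text/plain...
        return response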
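As a rough illustration of the "cmd waits for input" idea in Part 1, here is a minimal sketch of a process that is started first and blocks on a FIFO until data arrives. It assumes Python 3 and a POSIX system with os.mkfifo. Since cmd is not available here, STAND_IN_CMD is a hypothetical substitute that upper-cases the file named by its first argument, and start_waiting_process and feed are made-up names for illustration only.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for `cmd`: reads the file named by its first
# argument and writes the contents back upper-cased.
STAND_IN_CMD = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(open(sys.argv[1]).read().upper())",
]


def start_waiting_process():
    """Start the stand-in on a fresh FIFO; it blocks until data is written."""
    fifo_path = os.path.join(tempfile.mkdtemp(), "input.fifo")
    os.mkfifo(fifo_path)  # the path must not exist yet
    proc = subprocess.Popen(STAND_IN_CMD + [fifo_path],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc, fifo_path


def feed(proc, fifo_path, data):
    """Write data into the FIFO and collect the process's output."""
    # Opening a FIFO for writing blocks until a reader has opened it.
    with open(fifo_path, "w") as fifo:
        fifo.write(data)
    out, err = proc.communicate()
    # Remove the FIFO and its temporary directory once the process is done.
    os.remove(fifo_path)
    os.rmdir(os.path.dirname(fifo_path))
    return out.decode()


proc, path = start_waiting_process()
result = feed(proc, path, "hello")
```

No fork() is needed in this variant, because the parent collects stdout itself through communicate(). One hazard to keep in mind: if the process dies before it opens the FIFO, the open() call in feed would block indefinitely, so real code would probably want a timeout or a non-blocking open.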
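Part 2. A set of workers running in the background, waiting for data. That is, we want to extend the above so that the subprocess is already running: for example, a set of these workers is created when the Django instance initializes, or when call_process is first called.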
    WORKER_COUNT = 6
    WORKERS = []

    class Worker(object):
        def __init__(self, index):
            # mkfifo() needs a path that does not exist yet, so build the
            # FIFO inside a fresh temporary directory
            self.fifo_name = os.path.join(tempfile.mkdtemp(), "in.fifo")
            os.mkfifo(self.fifo_name)
            self.p = subprocess.Popen(["cmd", self.fifo_name],
                                      stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            self.index = index

        def run(self, out_filename, data):
            WORKERS[self.index] = None  # qua-mutex??
            f = open(self.fifo_name, "w")  # blocks until cmd has opened the FIFO
            f.write(data)
            f.close()
            if os.fork() == 0:  # does the child have access to self.p??
                out, err = self.p.communicate()
                o = open(out_filename, "w")
                o.write(out)
                o.close()
                os._exit(0)
            WORKERS[self.index] = Worker(self.index)  # replace this one
            return out_filename

        @classmethod
        def get_worker(cls):  # get the next worker
            # ... static, incrementing index
            pass
There should be some initialization of the workers somewhere, like this:
    def init_workers():  # create WORKER_COUNT workers
        for i in xrange(0, WORKER_COUNT):
            WORKERS.append(Worker(i))
Now, what I had above becomes something like:
    def _run_cmd(data):
        worker = Worker.get_worker()  # this needs to be atomic & lock worker at Worker.index
        # this FIFO stores the output of cmd; mkfifo() needs a path that does not exist yet
        fifo_name = os.path.join(tempfile.mkdtemp(), "out.fifo")
        os.mkfifo(fifo_name)
        worker.run(fifo_name, data)
        # please ignore the fact that everything will be
        # appended to out.txt ... these will be tmp files, too, but named elsewhere.
        fifo = open(fifo_name, "r")  # blocks until the child opens the FIFO for writing
        out = fifo.read()
        fifo.close()
        # read 'out' from a tmp file
        return out

    def call_process(request):
        response = HttpResponse()
        out = _run_cmd(request.GET['data'])
        response.write(out)  # would be text/plain...
        return response
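For comparison, here is a minimal sketch of how the "atomic get_worker" step might be handled with a thread-safe queue instead of a hand-rolled index. This is only one possible approach, not a tested design. It assumes Python 3 (the queue module) and a POSIX system. STAND_IN_CMD is a hypothetical substitute for cmd, and PooledWorker, idle_workers, run_cmd and shutdown_workers are names made up for this illustration.

```python
import os
import queue
import subprocess
import sys
import tempfile

# Hypothetical stand-in for `cmd` (upper-cases the file named in argv[1]).
STAND_IN_CMD = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(open(sys.argv[1]).read().upper())",
]
WORKER_COUNT = 3


class PooledWorker(object):
    """One pre-started process, blocked on its own input FIFO."""

    def __init__(self):
        self.fifo_path = os.path.join(tempfile.mkdtemp(), "input.fifo")
        os.mkfifo(self.fifo_path)
        self.proc = subprocess.Popen(STAND_IN_CMD + [self.fifo_path],
                                     stdout=subprocess.PIPE,
                                     stderr=subprocess.PIPE)

    def run(self, data):
        # Opening the FIFO for writing wakes the waiting process.
        with open(self.fifo_path, "w") as fifo:
            fifo.write(data)
        out, err = self.proc.communicate()
        self.cleanup()
        return out.decode()

    def cleanup(self):
        os.remove(self.fifo_path)
        os.rmdir(os.path.dirname(self.fifo_path))


# queue.Queue is thread-safe: get() hands each worker to exactly one caller.
idle_workers = queue.Queue()


def init_workers():
    for _ in range(WORKER_COUNT):
        idle_workers.put(PooledWorker())


def run_cmd(data):
    worker = idle_workers.get()  # blocks while every worker is busy
    try:
        return worker.run(data)
    finally:
        idle_workers.put(PooledWorker())  # start a replacement for next time


def shutdown_workers():
    # Kill the workers that are still waiting and remove their FIFOs.
    while not idle_workers.empty():
        worker = idle_workers.get()
        worker.proc.kill()
        worker.proc.communicate()
        worker.cleanup()


init_workers()
results = [run_cmd(text) for text in ["alpha", "beta", "gamma", "delta"]]
shutdown_workers()
```

Each worker is used once and then replaced, as in the Worker class above, so the start-up cost is paid in the background rather than during a request. Note that the pool lives in module globals, so if Django runs in several server processes, each of them would hold its own set of workers.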
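Now, the questions:
Will this work? (I've just typed this off the top of my head into StackOverflow, so I'm sure there are problems, but conceptually, will it work?)
What are the problems to look for?
Are there better alternatives to this? For example, could threads work just as well (it's Debian Lenny Linux)? Are there any libraries that handle parallel process worker pools like this?
Are there interactions with Django that I ought to be aware of?
Thanks for reading! I hope you find this as interesting a problem as I do.
Brian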