3

如何将数据从一个管道馈送到三个不同的进程?

nulfp = open(os.devnull, "w")

piper = Popen([
    "come command",
    "some params"
], stdout = PIPE, stderr = nulfp.fileno())

pipe_consumer_1 = Popen([
    "come command",
    "some params"
], stdin = piper.stdout, stderr = nulfp.fileno())

pipe_consumer_2 = Popen([
    "come command",
    "some params"
], stdin = piper.stdout, stderr = nulfp.fileno())

pipe_consumer_3 = Popen([
    "come command",
    "some params"
], stdin = piper.stdout, stderr = nulfp.fileno())

pipe_consumer_1.communicate()
pipe_consumer_2.communicate()
pipe_consumer_3.communicate()
piper.communicate()

如果我运行上面的代码,它将产生一个损坏的文件。这意味着管道使用者可能没有从吹笛者那里读取完整的输出。

这个可以正常工作,但速度要慢得多:

nulfp = open(os.devnull, "w")

piper_1 = Popen([
    "come command",
    "some params"
], stdout = PIPE, stderr = nulfp.fileno())

piper_2 = Popen([
    "come command",
    "some params"
], stdout = PIPE, stderr = nulfp.fileno())

piper_3 = Popen([
    "come command",
    "some params"
], stdout = PIPE, stderr = nulfp.fileno())

pipe_consumer_1 = Popen([
    "come command",
    "some params"
], stdin = piper_1.stdout, stderr = nulfp.fileno())

pipe_consumer_2 = Popen([
    "come command",
    "some params"
], stdin = piper_2.stdout, stderr = nulfp.fileno())

pipe_consumer_3 = Popen([
    "come command",
    "some params"
], stdin = piper_3.stdout, stderr = nulfp.fileno())

pipe_consumer_1.communicate()
pipe_consumer_2.communicate()
pipe_consumer_3.communicate()
piper_1.communicate()
piper_2.communicate()
piper_3.communicate()

有什么建议可以使第一个代码片段与第二个代码片段的工作方式相同吗?如果我得到第一种工作方法,该过程将在 1/3 的时间内完成。

4

2 回答 2

2

这仅使用单字节“块”,但您明白了。

from subprocess import Popen, PIPE

cat_proc = '/usr/bin/cat'

consumers = (Popen([cat_proc], stdin = PIPE, stdout = open('consumer1', 'w')),
             Popen([cat_proc], stdin = PIPE, stdout = open('consumer2', 'w')),
             Popen([cat_proc], stdin = PIPE, stdout = open('consumer3', 'w'))
)


with open('inputfile', 'r') as infile:
   for byte in infile:
       for consumer in consumers:
           consumer.stdin.write(byte)

测试时,消费者输出文件与输入文件匹配。

编辑:这是从具有 1K 块的进程中读取的。

from subprocess import Popen, PIPE

cat_proc = '/usr/bin/cat'

consumers = (Popen([cat_proc], stdin = PIPE, stdout = open('consumer1', 'w')),
             Popen([cat_proc], stdin = PIPE, stdout = open('consumer2', 'w')),
             Popen([cat_proc], stdin = PIPE, stdout = open('consumer3', 'w'))
)

producer = Popen([cat_proc, 'inputfile'], stdout = PIPE)

while True:
    byte = producer.stdout.read(1024)
    if not byte: break
    for consumer in consumers:
        consumer.stdin.write(byte)
于 2012-07-19T13:28:24.610 回答
1

管道中的数据只能读取一次,一旦读取就会从缓冲区中删除。这意味着所有消费者进程都只能看到数据的随机部分,当它们组合在一起时,将提供完整的流。当然,这对你来说不是很有用。

您可以让生产者进程写入subprocess.PIPE,从这个管道中读取块到缓冲区,然后将此缓冲区写入所有消费者进程。这意味着您必须自己完成所有缓冲区处理。使用它来为您完成这项工作可能更容易tee——我将很快发布一些示例代码。

于 2012-07-19T13:46:33.010 回答