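I have done similar tasks in the past for machine learning and data mining. Using multiprocessing in your case should not be that difficult. Depending on how fault-tolerant you want the program to be, you can use a thread pool pattern. My personal favourite is the producer-consumer pattern built on Queue; this design can handle a wide variety of complex tasks. Here is a toy example using multiprocessing: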
from multiprocessing import Queue, Process
from queue import Empty as QueueEmpty  # Python 3 name; the Python 2 module was called Queue

# Assume this text is very, very large
text = "Here I am writing some nonsense\nBut people will read\n..."


def read(q):
    """Read the text and put each line in a queue"""
    for line in text.split("\n"):
        q.put(line)


def work(qi, qo):
    """Move each line from the input queue to the output queue"""
    while True:
        try:
            data = qi.get(timeout=1)  # Time out after 1 second
            qo.put(data)
        except QueueEmpty:
            return  # Exit when all work is done


def join(q):
    """Drain the output queue and write it to a text file"""
    with open("file.txt", "w") as f:
        while True:
            try:
                # split("\n") dropped the newlines, so add them back
                f.write(q.get(timeout=1) + "\n")
            except QueueEmpty:
                return


def main():
    qi = Queue()  # Input queue
    qo = Queue()  # Output queue
    # Start the producer
    Process(target=read, args=(qi,)).start()
    # Start 8 consumers
    for i in range(8):
        Process(target=work, args=(qi, qo)).start()
    # Final process to drain the output queue
    Process(target=join, args=(qo,)).start()


if __name__ == "__main__":
    main()
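
A timeout-based exit can stop a worker too early if the producer ever stalls for more than a second. One common alternative is for the producer to send an explicit end-of-stream marker. The following is only a minimal sketch of that idea, not part of the program above: `SENTINEL`, `NUM_WORKERS`, `produce`, `collect`, and the `upper()` call are hypothetical names and placeholders chosen for illustration, and it assumes `None` never appears as real data.

```python
from multiprocessing import Process, Queue

SENTINEL = None   # hypothetical end-of-stream marker; assumes None is never real data
NUM_WORKERS = 4   # illustrative worker count


def produce(lines, qi, num_workers):
    """Put every line on the input queue, then one sentinel per worker."""
    for line in lines:
        qi.put(line)
    for _ in range(num_workers):
        qi.put(SENTINEL)


def work(qi, qo):
    """Process lines until a sentinel arrives, then forward the sentinel."""
    while True:
        line = qi.get()  # blocks until an item is available; no timeout needed
        if line is SENTINEL:
            qo.put(SENTINEL)  # tell the collector this worker is finished
            return
        qo.put(line.upper())  # placeholder for the real per-line work


def collect(qo, num_workers):
    """Drain the output queue until every worker has sent its sentinel."""
    results, finished = [], 0
    while finished < num_workers:
        item = qo.get()
        if item is SENTINEL:
            finished += 1
        else:
            results.append(item)
    return results


def main():
    lines = "Here I am writing some nonsense\nBut people will read".split("\n")
    qi, qo = Queue(), Queue()
    workers = [Process(target=work, args=(qi, qo)) for _ in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    produce(lines, qi, NUM_WORKERS)
    results = collect(qo, NUM_WORKERS)  # drain the queue before joining the workers
    for w in workers:
        w.join()
    print(sorted(results))  # arrival order is not guaranteed, so sort for display


if __name__ == "__main__":
    main()
```

Results are collected before the workers are joined because a process that still has items buffered for a queue may not exit until they are consumed.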
Typed from memory; please point out any mistakes. :)
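
For comparison, the pool pattern mentioned at the start can be sketched with `multiprocessing.Pool`. Note that this is a pool of processes rather than threads, which is a substitution on my part, and that `process_line`, `run_pool`, the worker count, and the chunk size are hypothetical names and values for illustration only. It fits best when each line can be processed independently.

```python
from multiprocessing import Pool


def process_line(line):
    """Placeholder for the real per-line work; here it just counts words."""
    return len(line.split())


def run_pool(lines, workers=4):
    """Apply process_line to every line using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # imap yields results in input order; chunksize batches lines
        # to reduce inter-process communication overhead.
        return list(pool.imap(process_line, lines, chunksize=64))


if __name__ == "__main__":
    text = "Here I am writing some nonsense\nBut people will read"
    print(run_pool(text.split("\n")))
```

Compared with the queue-based design, the pool hides the queues and the shutdown logic, at the cost of less control over how work flows between stages.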