python - Python multiprocessing: member functions of classes

Question

I can't tell whether this is my mistake or a problem with the multiprocessing module in Python 2.7. Can anyone figure out why this isn't working?

import sys
from multiprocessing import Pool as mp
class encapsulation:
   def __init__(self):
       self.member_dict = {}
   def update_dict(self,index,value):
       self.member_dict[index] = value
encaps = encapsulation()
def method(argument):
   encaps.update_dict(argument,argument)
   print encaps.member_dict
p = mp() #sets up multiprocess pool of processors
p.map(method,sys.argv[1:]) #method is the function, sys.argv is the list of arguments to multiprocess
print encaps.member_dict
>>>{argument:argument}
>>>{}

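So my question is really just about member variables. My understanding was that the class encapsulation should hold this dictionary both inside and outside the function. Why does it reset and give me an empty dictionary even though I only initialized it once? Please help.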

score 2 · Accepted Answer

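Even though you have encapsulated the object, the multiprocessing module ends up using a local copy of the object in each process and never propagates your changes back to you. In this case you are also not using Pool.map as intended: it expects each call to return a result, which it then sends back to you as the return value. If you want to modify a shared object instead, you need a Manager, which coordinates access to the shared object across processes: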

Encapsulating a shared object

from multiprocessing import Pool 
from multiprocessing import Manager
import sys

class encapsulation:
   def __init__(self):
       self.member_dict = {}
   def update_dict(self,index,value):
       self.member_dict[index] = value

encaps = encapsulation()

def method(argument):
   encaps.update_dict(argument,argument)
   # print encaps.member_dict       

manager = Manager()
encaps.member_dict = manager.dict()

p = Pool()
p.map(method,sys.argv[1:])

print encaps.member_dict

Output

$ python mp.py a b c
{'a': 'a', 'c': 'c', 'b': 'b'}
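
To make the difference concrete, here is a minimal sketch that writes to a plain dict and to a Manager dict from child processes. The names record, plain and shared are made up for illustration, and it uses print() calls so that it also runs on Python 3; treat it as a demonstration under those assumptions, not as the only way to do this.

```python
from multiprocessing import Manager, Process


def record(target, key):
    """Store key -> key in whatever dict-like object is passed in."""
    target[key] = key


if __name__ == "__main__":
    plain = {}               # ordinary dict: each child works on its own copy
    manager = Manager()      # starts a server process that owns shared objects
    shared = manager.dict()  # proxy: writes are forwarded to the server process

    workers = []
    for key in ["a", "b", "c"]:
        workers.append(Process(target=record, args=(plain, key)))
        workers.append(Process(target=record, args=(shared, key)))
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    print(plain)                   # still empty in the parent
    print(sorted(shared.items()))  # all three keys arrived via the manager
```

The plain dict stays empty in the parent because every child only modified its own copy, which is the same effect seen in the question.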

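I would suggest not setting the shared object as the member attribute, but rather passing it in as an argument, or wrapping the shared object itself and then copying its values into your dict. The shared object should not be kept around permanently: it only works while the manager process is alive, so copy its contents out and discard it: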

# copy the values to a regular dict
encaps.member_dict = encaps.member_dict.copy()

But this might be nicer:

class encapsulation:
   def __init__(self):
       self.member_dict = {}
   # normal dict update
   def update_dict(self,d):
       self.member_dict.update(d)

encaps = encapsulation()

manager = Manager()
results_dict = manager.dict()

# pass in the shared object only
def method(argument):
   results_dict[argument] = argument    

p = Pool()
p.map(method,sys.argv[1:])

encaps.update_dict(results_dict)
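
The snippet above relies on the worker processes inheriting the module-level results_dict. On platforms or Python versions where workers are started fresh rather than forked, that cannot be assumed, so a variant that hands the proxy to each call explicitly may be more portable. This is a sketch only: the use of functools.partial, the Encapsulation class name, and the fallback keys used when no command-line arguments are given are my own assumptions.

```python
import sys
from functools import partial
from multiprocessing import Manager, Pool


class Encapsulation(object):
    def __init__(self):
        self.member_dict = {}

    def update_dict(self, d):
        # normal dict update
        self.member_dict.update(d)


def method(results_dict, argument):
    # The shared dict arrives as an explicit argument, not as a global.
    results_dict[argument] = argument


if __name__ == "__main__":
    keys = sys.argv[1:] or ["a", "b", "c"]  # fallback keys are illustrative

    encaps = Encapsulation()
    manager = Manager()
    results_dict = manager.dict()

    pool = Pool()
    try:
        # partial binds the proxy as the first argument of every call
        pool.map(partial(method, results_dict), keys)
    finally:
        pool.close()
        pool.join()

    # copy() returns a regular dict, so nothing keeps pointing at the manager
    encaps.update_dict(results_dict.copy())
    print(sorted(encaps.member_dict.items()))
```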

Using Pool.map as intended

If you use map to return values, it might look like this:

def method(argument):
   encaps.update_dict(argument,argument)
   return encaps.member_dict

p = Pool()
results = p.map(method,sys.argv[1:]) 
print results
# e.g. [{'a': 'a'}, {'b': 'b'}, {'c': 'c'}]
# (a worker that handles several arguments returns its accumulated dict)

You would then need to merge the results back into your dict:

for result in results:
    encaps.member_dict.update(result)
print encaps.member_dict
# {'a': 'a', 'c': 'c', 'b': 'b'}
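
A variation on the same idea, offered as a sketch rather than as part of the original answer: if the worker touches no object state at all and simply returns a (key, value) pair, the parent can build the dict in one step. The fallback keys are an assumption for when no command-line arguments are passed, and print() is used so the code also runs on Python 3.

```python
import sys
from multiprocessing import Pool


def method(argument):
    # No shared state: just return what the parent should store.
    return argument, argument


if __name__ == "__main__":
    keys = sys.argv[1:] or ["a", "b", "c"]  # fallback keys are illustrative

    pool = Pool()
    try:
        pairs = pool.map(method, keys)  # list of (key, value) tuples
    finally:
        pool.close()
        pool.join()

    member_dict = dict(pairs)  # merge happens once, in the parent
    print(sorted(member_dict.items()))
```

Because each call returns only its own pair, the result does not depend on which worker happened to process which argument.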
