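I'm working on a project that requires me to extract a large amount of information from a set of files. The format and most of the details of the project don't matter for what I'm asking. What I mostly don't understand is how to share this dictionary with all the processes in a process pool.
Here is my code (variable names changed and most of the code removed, leaving only the parts you need to know):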
import json
import multiprocessing
from multiprocessing import Pool, Lock, Manager
import glob
import os

def record(thing, map):
    with mutex:
        if(thing in map):
            map[thing] += 1
        else:
            map[thing] = 1

def getThing(file, n, map):
    #do stuff
    thing = file.read()
    record(thing, map)

def init(l):
    global mutex
    mutex = l

def main():
    #create a manager to manage shared dictionaries
    manager = Manager()
    #get the list of filenames to be analyzed
    fileSet1=glob.glob("filesSet1/*")
    fileSet2=glob.glob("fileSet2/*")
    #create a global mutex for the processes to share
    l = Lock()
    map = manager.dict()
    #create a process pool, give it the global mutex, and max cpu count-1 (manager is its own process)
    with Pool(processes=multiprocessing.cpu_count()-1, initializer=init, initargs=(l,)) as pool:
        pool.map(lambda file: getThing(file, 2, map), fileSet1) #This line is what i need help with

main()
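As far as I understand, the lambda function should work. The line I need help with is: pool.map(lambda file: getThing(file, 2, map), fileSet1). That is where I get an error. The error is "AttributeError: Can't pickle local object 'main.<locals>.<lambda>'".
Any help would be greatly appreciated!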
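
For context on the error above: Pool.map has to pickle the callable it sends to the worker processes, and the standard pickle module stores functions by their importable name, so a lambda created inside main() cannot be serialized. One possible workaround, sketched below under some assumptions, is to keep the worker function at module level and bind the extra arguments with functools.partial. The names get_thing, counts, and count_things are hypothetical (count_things is a helper invented for this sketch, and map is renamed to counts so it no longer shadows the built-in), and the sketch assumes each item in the file list is a path that needs to be opened before reading.

```python
from functools import partial
from multiprocessing import Lock, Manager, Pool


def init(l):
    # Runs once in every worker: keep the shared lock in a module-level global.
    global mutex
    mutex = l


def record(thing, counts):
    # The lock makes the read-modify-write on the shared dict atomic.
    with mutex:
        counts[thing] = counts.get(thing, 0) + 1


def get_thing(path, n, counts):
    # Defined at module level, so pickle can refer to it by name.
    # Assumption: `path` is a filename, so it is opened before reading.
    with open(path) as f:
        thing = f.read()
    record(thing, counts)


def count_things(paths, processes=2):
    # Hypothetical helper wrapping the Manager/Pool setup from the question.
    with Manager() as manager:
        counts = manager.dict()
        lock = Lock()
        # partial objects are picklable when the wrapped function and the
        # bound arguments are; the manager's dict proxy is picklable.
        worker = partial(get_thing, n=2, counts=counts)
        with Pool(processes=processes, initializer=init, initargs=(lock,)) as pool:
            pool.map(worker, paths)
        # Copy into a plain dict before the manager shuts down.
        return counts.copy()
```

It would be called as count_things(glob.glob("filesSet1/*")), ideally from under an if __name__ == "__main__": guard, since start methods that re-import the main module in each worker would otherwise run the top-level code again.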