python - Parallel processes

Question

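I would like to know whether it is possible to run a Python script that calls a function as parallel child processes. I'm not sure I'm using these terms correctly, so here is a conceptual script, modeled on a bash script, that does what I mean.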

import Zfunctions as Z
reload(Z)

def Parallel():
    statements
    calls to other functions in a general function file Z

#--------------
if __name__ == '__main__':
    # Running this script on a Linux cluster with 8 processing nodes available
    Parallel() &  # 1st process sent to 1st processing node
    Parallel() &  # 2nd process sent to 2nd node
    .
    .
    .
    Parallel() &  # 8th process sent to 8th node
    wait

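Now, I know that the ampersand (&) and "wait" are wrong here, but in bash this is how you send processes to the background and wait for them to finish. My question, hopefully clearer now: can this be done in Python, and if so, how?

Any help is appreciated.

/M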

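I got some good help. I tested the modification below of the script in my question. It tries to run 60 jobs, each of which processes a large amount of data and writes its results to disk. Everything lives in a single Python file that combines two for loops with a series of internal function calls. The script fails, and the error output is shown after the code: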

import multiprocessing

def Parallel(m,w,PROCESSES):
    plist = {}
    plist['timespan'] = '2007-2008'
    print 'Creating pool with %d processes\n' % PROCESSES
    pool = multiprocessing.Pool(PROCESSES)
    print 'pool = %s' % pool

    TASKS = [(LRCE,(plist,m,w)),(SRCE,(plist,m,w)),(ALBEDO,(plist,m,w)),
             (SW,(plist,m,w)),(RR,(plist,m,w)),(OLR,(plist,m,w)),(TRMM,(plist,w)),
             (IWP,(plist,m,w)),(RH,(plist,'uth',m,w)),(RH,(plist,200,m,w)),
             (RH,(plist,400,m,w)),(IWC,(plist,200,m,w)),(IWC,(plist,400,m,w)),
             (CC,(plist,200,m,w)),(CC,(plist,400,m,w))]

    results = [pool.apply_async(calculate,t) for t in TASKS]
    print 'Ordered results using pool.apply_async():'
    for r in results:
        print '\t', r.get()

#-----------------------------------------------------------------------------------
if __name__ == '__main__':
    PROCESSES = 8
    for w in np.arange(2):
        for m in np.arange(2):
            Parallel(m,w,PROCESSES)

#### Error message from the cluster

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/software/apps/python/2.7.2-smhi1/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/software/apps/python/2.7.2-smhi1/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/software/apps/python/2.7.2-smhi1/lib/python2.7/multiprocessing/pool.py", line 313, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
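The traceback ends in put(task), the step where the pool pickles each task before sending it to a worker. So the failure means some function in the submitted task could not be pickled. Which object is responsible cannot be told from the post alone. The following is only a minimal sketch for checking candidates ahead of time; is_picklable is a hypothetical helper, not part of multiprocessing, and the two objects tested are illustrative stand-ins.

```python
import math
import pickle


def is_picklable(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A function that can be looked up by module and name is pickled by reference.
print(is_picklable(math.sqrt))

# A lambda has no importable name, so pickling it fails.
print(is_picklable(lambda x: x))
```

Running the same check on each function and argument placed in TASKS may help narrow down which entry the pool cannot ship to its workers.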

score 3 · Accepted Answer

You may want to look into multiprocessing. Your code could be written as follows:

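import multiprocessing

def Parallel(junk):
    # ...snip...
    pass  # placeholder so the elided body still parses

if __name__ == "__main__":
    # Create a pool of 8 worker processes
    p = multiprocessing.Pool(8)

    # Call Parallel once per item; map blocks until every call has finished
    results = p.map(Parallel, range(8))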

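One caveat: don't try this in the interactive interpreter. The worker processes need to be able to import the target function from the main module, which generally does not work there.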
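For readers who want the closest analogue to bash's "&" followed by "wait", here is a minimal sketch using apply_async. The names work and run_jobs are hypothetical stand-ins for illustration; the real per-job computation would replace the body of work.

```python
import multiprocessing


def work(job_id):
    """Hypothetical stand-in for the real per-job computation."""
    return job_id * job_id


def run_jobs(job_ids, processes=8):
    """Submit every job to a pool, then wait for all results in order."""
    pool = multiprocessing.Pool(processes)
    try:
        # Like "cmd &": each call returns immediately with a pending result.
        pending = [pool.apply_async(work, (job_id,)) for job_id in job_ids]
        # Like "wait": get() blocks until that job has finished.
        return [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run_jobs(range(8)))
```

Keeping work at module level lets the pool pickle it by name. Note that a Pool starts its workers on the local machine only, so this uses the cores of one node rather than spreading jobs across separate cluster nodes.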
