I finally understood how to replace pickle with dill from an example in the following discussion: pickle-dill. For instance, the following code works for me:

import os
import dill
import multiprocessing

def run_dill_encoded(what):
    fun, args = dill.loads(what)
    return fun(*args)

def apply_async(pool, fun, args):
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),))

if __name__ == '__main__':

    pool = multiprocessing.Pool(5)
    results = [apply_async(pool, lambda x: x*x, args=(x,)) for x in range(1,7)]
    output = [p.get() for p in results]
    print(output)

I tried to apply the same idea to pymongo. The following code

import os
import dill
import multiprocessing
import pymongo

def run_dill_encoded(what):
    fun, args = dill.loads(what)
    return fun(*args)


def apply_async(pool, fun, args):
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),))


def write_to_db(value_to_insert):           
    client = pymongo.MongoClient('localhost',  27017)
    db = client['somedb']
    collection = db['somecollection']
    result = collection.insert_one({"filed1": value_to_insert})
    client.close()

if __name__ == '__main__':
    pool = multiprocessing.Pool(5)
    results = [apply_async(pool, write_to_db, args=(x,)) for x in ['one', 'two', 'three']]
    output = [p.get() for p in results]
    print(output)

produces this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Python34\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\...\temp2.py", line 10, in run_dill_encoded
    return fun(*args)
  File "C:\...\temp2.py", line 21, in write_to_db
    client = pymongo.MongoClient('localhost',  27017)
NameError: name 'pymongo' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/.../temp2.py", line 32, in <module>
    output = [p.get() for p in results]
  File "C:/.../temp2.py", line 32, in <listcomp>
    output = [p.get() for p in results]
  File "C:\Python34\lib\multiprocessing\pool.py", line 599, in get
    raise self._value
NameError: name 'pymongo' is not defined

Process finished with exit code 1

What is going wrong?

1 Answer

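As I mentioned in the comments, you need to put an `import pymongo` inside the `write_to_db` function. This is because when the function is serialized and shipped to another process space, it does not carry any of its global references with it, so the module-level name `pymongo` is undefined in the worker.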
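A minimal sketch of that change, assuming the rest of the script from the question (`run_dill_encoded`, `apply_async` and the `__main__` block) stays as it is. Only `write_to_db` is shown. The host, port, database, collection and field names are copied from the question, and the `try`/`finally` is an addition for tidy cleanup rather than part of the fix:

```python
def write_to_db(value_to_insert):
    # Import inside the function body: the name `pymongo` is then bound
    # locally in whichever process runs the function, instead of being
    # looked up as a global that did not travel with the serialized function.
    import pymongo

    client = pymongo.MongoClient('localhost', 27017)
    try:
        collection = client['somedb']['somecollection']
        collection.insert_one({"filed1": value_to_insert})
    finally:
        # Close the connection even if the insert raises.
        client.close()
```

Because the import statement runs on each call, the worker process resolves `pymongo` itself the first time it executes the function. Later calls in the same worker reuse the module already cached in `sys.modules`, so the repeated import costs very little.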

answered 2016-04-21T13:50:48.263