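I use cPickle to serialize data that is used for logging.
I'd like to be able to throw anything I want into an object and then serialize it. Usually this works fine with cPickle, but I just ran into a problem: one of the objects I want to serialize contains a function, and that causes cPickle to raise an exception.
I would rather have cPickle skip what it can't handle than have the whole process blow up.
What is a good way to achieve this?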
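I'm assuming that you are looking for a best-effort solution and that you are fine if the unpickled results don't function properly.
For your particular use case, you may want to register a pickle handler for function objects. Just make it a dummy handler that is good enough for your best-effort purposes. Writing a handler that actually restores functions is possible, but it is fairly tricky. To avoid affecting other code that pickles, you will probably want to deregister the handler when exiting your logging code.
Here is an example (without any deregistration):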
import cPickle
import copy_reg
from types import FunctionType

# data to pickle: note that o['x'] is a lambda, and lambdas
# aren't natively picklable (at this time)
o = {'x': lambda x: x, 'y': 1}

# show that o is not natively picklable (because of o['x'])
try:
    cPickle.dumps(o)
except TypeError:
    print "not natively picklable"
else:
    print "was pickled natively"

# create a mechanism to turn unpicklable functions into
# stub objects (the string "STUB" in this case)
def stub_pickler(obj):
    return stub_unpickler, ()

def stub_unpickler():
    return "STUB"

copy_reg.pickle(
    FunctionType,
    stub_pickler, stub_unpickler)

# show that o is now picklable, but o['x'] is restored
# to the stub object instead of its original lambda
print cPickle.loads(cPickle.dumps(o))
It prints:
not natively picklable
{'y': 1, 'x': 'STUB'}
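For readers on Python 3, where cPickle and copy_reg became pickle and copyreg, the same stub idea can be scoped to a single pickler so that no deregistration step is needed. The following is only a minimal sketch, assuming Python 3.8 or later (the version that added Pickler.reducer_override); StubbingPickler and best_effort_dumps are hypothetical names chosen for illustration, not part of any library:

```python
import io
import pickle
from types import FunctionType


class StubbingPickler(pickle.Pickler):
    """Hypothetical helper: pickles any Python function as the string "STUB"."""

    def reducer_override(self, obj):
        if isinstance(obj, FunctionType):
            # At load time this calls str("STUB"). Using a builtin type as the
            # callable avoids re-entering this hook with another function.
            return str, ("STUB",)
        # Everything else goes through the normal pickling machinery.
        return NotImplemented


def best_effort_dumps(obj):
    # Hypothetical convenience wrapper around StubbingPickler.
    buf = io.BytesIO()
    StubbingPickler(buf).dump(obj)
    return buf.getvalue()


o = {'x': lambda x: x, 'y': 1}
restored = pickle.loads(best_effort_dumps(o))
print(restored)  # {'x': 'STUB', 'y': 1}
```

Because the hook lives on the pickler instance rather than in a global registry, other code that calls pickle.dumps keeps its normal behavior. Note that this sketch stubs out every Python function, including module-level ones that pickle could otherwise store by reference.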
Alternatively, try cloudpickle:
>>> import cloudpickle
>>> squared = lambda x: x ** 2
>>> pickled_lambda = cloudpickle.dumps(squared)
>>> import pickle
>>> new_squared = pickle.loads(pickled_lambda)
>>> new_squared(2)
4
pip install cloudpickle and live your dreams. These are the same dreams lived by dask, IPython parallel, and PySpark.