
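Recently I was asked to make "our C++ library work in the cloud". Basically, the library is compute-intensive (it calculates prices), so that makes sense. I've built a SWIG interface to produce a Python version, and I'm thinking of using MapReduce with MRJob. I want to serialize the objects to a file and then, in a mapper, deserialize them and compute the price.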

For example:

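import dill
from mrjob.job import MRJob

class MRTest(MRJob):
    def mapper(self, key, value):
        # value holds the serialized object; rebuild it, then price it
        obj = dill.loads(value)
        yield (key, obj.price())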

But now I've hit a dead end, because dill doesn't seem to be able to handle SWIG extensions:

PicklingError: Can't pickle <class 'SwigPyObject'>: it's not found as builtins.SwigPyObject

Is there a way to make this work?


1 Answer


I'm the dill author. That's correct, dill can't pickle C++ objects. When you see "it's not found as builtins.some_object", that almost invariably means you are trying to pickle an object that is not written in Python but uses Python to bind to C/C++ (i.e. an extension type). You have no hope of directly pickling such objects with a Python serializer.

However, since you are interested in pickling a subclass of an extension type, you can actually do it. All you need to do is give your object the state you want to save as one or more instance attributes, and provide a __reduce__ method that tells dill (or pickle) how to save that state. This is how Python deals with serializing extension types. See: https://docs.python.org/2/library/pickle.html#pickling-and-unpickling-extension-types

There are probably better examples, but here's at least one example: https://stackoverflow.com/a/19874769/4646678

answered 2015-08-31T15:48:13.983