16

The question may seem a little basic, but I wasn't able to find anything that I understood on the internet. How do I store something that I pickled with dill?

I have come this far for saving my construct (a pandas DataFrame, which also contains custom classes):

import dill
dill_file = open("data/2017-02-10_21:43_resultstatsDF", "wb")
dill_file.write(dill.dumps(resultstatsDF))
dill_file.close()

and for reading

dill_file = open("data/2017-02-10_21:43_resultstatsDF", "rb")
resultstatsDF_out = dill.load(dill_file.read())
dill_file.close()

but when reading I get the error

TypeError: file must have 'read' and 'readline' attributes

How do I do this?


EDIT for future readers: After having used this approach (pickling my DataFrame) for a while, I now refrain from doing so. As it turns out, changes between program versions (including changes to the classes of objects stored in the dill file) can make it impossible to recover the pickled file. Now I make sure that everything I want to save can be expressed as a string (as efficiently as possible), ideally a human-readable one. I store my data as CSV, and objects in CSV cells can be represented in JSON format. That way I make sure that my files will still be readable in the months and years to come. Even if the code changes, I am able to rewrite the encoders by parsing the strings, and I am able to understand the CSV by inspecting it manually.
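As a rough illustration of the CSV-with-JSON-cells idea described in the edit above, here is a minimal sketch using only the standard library. The column names `run` and `params` and the sample rows are hypothetical, and an in-memory buffer stands in for a real file.

```python
import csv
import io
import json

# Hypothetical sample data: one plain column and one column holding a nested object.
rows = [
    {"run": "a", "params": {"alpha": 0.1, "layers": [4, 2]}},
    {"run": "b", "params": {"alpha": 0.5, "layers": [8]}},
]

# Write: encode the nested object as a JSON string inside a single CSV cell.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["run", "params"])
writer.writeheader()
for row in rows:
    writer.writerow({"run": row["run"], "params": json.dumps(row["params"])})

csv_text = buffer.getvalue()  # human-readable text that could be saved to disk

# Read: parse the CSV, then decode the JSON cell back into Python objects.
reader = csv.DictReader(io.StringIO(csv_text))
restored = [{"run": r["run"], "params": json.loads(r["params"])} for r in reader]
```

The csv module takes care of quoting the commas and double quotes that appear inside the JSON strings, so each object stays in one cell.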


1 Answer

30

Just give it the file itself, without calling read:

resultstatsDF_out = dill.load(dill_file)

You can also dump to the file like this:

with open("data/2017-02-10_21:43_resultstatsDF", "wb") as dill_file:
    dill.dump(resultstatsDF, dill_file)

So:

dill.dump(obj, open_file)

writes directly to the file. Whereas:

dill.dumps(obj) 

serializes obj, and you can then write the result to a file yourself.

Likewise:

dill.load(open_file)

reads from a file, and:

dill.loads(serialized_obj)

constructs an object from its serialized form, which you could have read from a file.
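A minimal round-trip sketch of the four calls side by side may help. The sample object and the temporary path are hypothetical, and the fallback to the standard-library pickle module is an assumption made only so the sketch runs where dill is not installed.

```python
import os
import tempfile

try:
    import dill
except ImportError:
    # Assumption for this sketch only: fall back to the standard-library
    # pickle module, which exposes the same four function names.
    import pickle as dill

obj = {"name": "resultstats", "values": [1, 2, 3]}  # hypothetical sample object
path = os.path.join(tempfile.mkdtemp(), "example.dill")  # hypothetical path

# dump / load take an open binary file object.
with open(path, "wb") as dill_file:
    dill.dump(obj, dill_file)
with open(path, "rb") as dill_file:
    from_file = dill.load(dill_file)

# dumps / loads work on bytes that you write and read yourself.
serialized = dill.dumps(obj)
with open(path, "wb") as dill_file:
    dill_file.write(serialized)
with open(path, "rb") as dill_file:
    from_bytes = dill.loads(dill_file.read())
```

Both routes give back an equal object. The error in the question comes from mixing them: passing the bytes from read() to load, which expects the file object.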

Using the with statement to open files is recommended.

This:

with open(path) as fobj:
    # do something with fobj

has the same effect as:

fobj = open(path)
try:
    # do something with fobj
finally:
    fobj.close()

The file is closed as soon as you leave the indented block of the with statement, even if an exception is raised.
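To see the closing behaviour for yourself, here is a small sketch that raises an exception inside the with block and then checks the file object. The temporary path and the simulated error are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")  # hypothetical path

try:
    with open(path, "w") as fobj:
        fobj.write("partial data")
        # Simulate a failure in the middle of the block.
        raise ValueError("simulated failure inside the with block")
except ValueError:
    pass

# The name fobj is still bound after the block, but the file itself is closed.
closed_after_error = fobj.closed
```

Here `closed_after_error` ends up True, which is the guarantee you would otherwise have to spell out with try/finally.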

answered 2017-02-10T20:50:52.090