
I have a dict containing several pandas DataFrames (identified by keys). Any suggestion on how to effectively serialize it (and cleanly load it back)? Here is the structure (a pprint display of the output). Each dict['method_x_']['meas_x_'] is a pandas DataFrame. The goal is to save the DataFrames for later plotting with some specific plotting options.

{'method1':

{'meas1':

                          config1   config2
                   0      0.193647  0.204673
                   1      0.251833  0.284560
                   2      0.227573  0.220327,
'meas2':   
                          config1   config2
                   0      0.172787  0.147287
                   1      0.061560  0.094000
                   2      0.045133  0.034760},

'method2':

{ 'meas1':

                          config1   config2
                   0      0.193647  0.204673
                   1      0.251833  0.284560
                   2      0.227573  0.220327,

'meas2':

                          config1   config2
                   0      0.172787  0.147287
                   1      0.061560  0.094000
                   2      0.045133  0.034760}}

2 Answers


Use pickle.dump(s) and pickle.load(s). It does work. Pandas DataFrames also have their own method df.save("filename") that you can use to serialize a single DataFrame...
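A minimal sketch of the pickle approach applied to the nested dict from the question (the file name all_df.p and the sample values are just illustrations):

```python
import pickle
import pandas as pd

# A small nested dict of DataFrames, mirroring the question's structure.
all_df = {
    "method1": {
        "meas1": pd.DataFrame({"config1": [0.19, 0.25], "config2": [0.20, 0.28]}),
        "meas2": pd.DataFrame({"config1": [0.17, 0.06], "config2": [0.15, 0.09]}),
    }
}

# Serialize the whole nested dict in one go...
with open("all_df.p", "wb") as f:
    pickle.dump(all_df, f)

# ...and load it back; the nested structure is restored as-is.
with open("all_df.p", "rb") as f:
    restored = pickle.load(f)
```

Because pickle handles arbitrary nested Python objects, no flattening of the dict is needed; the trade-off is that pickle files are Python-specific and not safe to load from untrusted sources.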

Answered 2013-07-28T11:59:49.327

In my particular use case, I tried a simple pickle.dump(all_df, open("all_df.p","wb"))

and although it loads back correctly with all_df = pickle.load(open("all_df.p","rb")),

after restarting my Jupyter environment I would get an UnpicklingError: invalid load key, '\xef'.

One approach described here suggests that we can use HDF5 (PyTables) for the job. From their documentation:

HDFStore is a dict-like object that reads and writes pandas

But it seems to be picky about which version of tables you use. I got mine working after running pip install --upgrade tables and restarting the runtime.

If you need a general idea of how to use it:

# consider all_df as a dict of DataFrames
import pandas as pd

with pd.HDFStore('df_store.h5') as df_store:
    for key, df in all_df.items():
        df_store[key] = df

You should now have a df_store.h5 file, which you can convert back with the reverse process:

new_all_df = dict()
with pd.HDFStore('df_store.h5') as df_store:
    # note: HDFStore returns keys with a leading '/', e.g. '/meas1'
    for key in df_store.keys():
        new_all_df[key] = df_store[key]
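Since the question's dict is nested (method → meas → DataFrame) while an HDFStore slot holds a single DataFrame, one option is to flatten the structure into path-like keys such as 'method1/meas1', which HDFStore accepts directly as hierarchical paths. A sketch under that assumption (the helper names flatten_dfs and unflatten_dfs are mine, not part of the answer):

```python
import pandas as pd

def flatten_dfs(nested):
    """Flatten {'method': {'meas': df}} into {'method/meas': df}."""
    return {
        f"{method}/{meas}": df
        for method, inner in nested.items()
        for meas, df in inner.items()
    }

def unflatten_dfs(flat):
    """Rebuild the nested dict from 'method/meas' keys (leading '/' tolerated)."""
    nested = {}
    for key, df in flat.items():
        method, meas = key.lstrip("/").split("/")
        nested.setdefault(method, {})[meas] = df
    return nested

all_df = {
    "method1": {"meas1": pd.DataFrame({"config1": [0.19], "config2": [0.20]})},
    "method2": {"meas1": pd.DataFrame({"config1": [0.17], "config2": [0.15]})},
}

flat = flatten_dfs(all_df)           # keys like 'method1/meas1', usable as HDFStore paths
round_tripped = unflatten_dfs(flat)  # nested structure restored
```

The flat keys can then be assigned straight into the store (df_store['method1/meas1'] = df); df_store.keys() will later return them with a leading '/', which unflatten_dfs strips.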

Answered 2020-11-02T20:50:15.663