I am trying to allow pandas DataFrame objects to be defined in a YAML file, which I believe should be possible because DataFrame objects are picklable.
My stripped-down YAML file, saved as "config.yaml", is:
!!python/object/new:pandas.DataFrame [[{'dimension1_id':58,'metric1':10},{'dimension1_id':50,'metric':10}]]
I load the data into my Python script with the following:
import yaml

f = open('config.yaml')
y = yaml.load(f)
print y
The (abbreviated) output is:
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2085, in __getattr__
    if name in self.columns:
  File "properties.pyx", line 55, in pandas.lib.AxisProperty.__get__ (pandas\lib.c:29240)
RuntimeError: maximum recursion depth exceeded while calling a Python object
The PyYAML documentation has been my only source of information.
Can anyone guess why pandas ends up in infinite recursion?
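My own best guess at the mechanism, sketched without pandas: as far as I can tell, the !!python/object/new tag makes PyYAML allocate the object with __new__ and skip __init__, so attributes that __init__ would normally create are missing. If __getattr__ then reads a property that depends on one of those attributes, every failed lookup re-enters __getattr__. The Frame class and the probe helper below are hypothetical stand-ins, not pandas code, and I have not confirmed that pandas fails for exactly this reason.

```python
class Frame(object):
    """Hypothetical stand-in for a DataFrame-like class (not pandas code)."""

    def __init__(self, columns):
        self._data = list(columns)

    @property
    def columns(self):
        # Reads an attribute that only __init__ creates.
        return self._data

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name in self.columns:
            return name
        raise AttributeError(name)


def probe(obj):
    """Look up an attribute and report whether the lookup recursed."""
    try:
        return obj.metric1
    except RuntimeError:
        # "maximum recursion depth exceeded" is a RuntimeError
        # (RecursionError, its subclass, on Python 3.5+).
        return "recursion"


built_normally = Frame(["metric1"])

# Assumption: this mimics a loader that allocates the instance without
# running __init__, so _data is never set.
built_without_init = Frame.__new__(Frame)

print(probe(built_normally))
print(probe(built_without_init))
```

In the second case the lookup of _data fails, which calls __getattr__, which reads columns, which looks up _data again, and so on until the recursion limit is hit. That matches the shape of the traceback, where __getattr__ and the columns property keep calling each other.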
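Edit: It seems that DataFrame objects are not cleanly serializable to YAML by default, and the extra work looks like more trouble than it is worth. Here is the YAML file that yaml_serializer creates from a simple DataFrame object: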
!!python/object/new:pandas.core.frame.DataFrame
state: !!python/object/new:pandas.core.internals.BlockManager
  state:
  - - !!python/object/apply:numpy.core.multiarray._reconstruct
      args:
      - &id001 !!python/name:pandas.core.index.Index ''
      - [0]
      - b
      state:
      - - 1
        - [!!python/long '2']
        - &id002 !dtype 'object'
        - false
        - [dfsd, id]
      - [null]
    - !!python/object/apply:numpy.core.multiarray._reconstruct
      args:
      - !!python/name:pandas.core.index.Int64Index ''
      - [0]
      - b
      state:
      - - 1
        - [!!python/long '2']
        - !dtype 'int64'
        - false
        - "\0\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\0"
      - [null]
  - - - [!!python/long '23', !!python/long '123']
      - [!!python/long '7', !!python/long '123']
  - - !!python/object/apply:numpy.core.multiarray._reconstruct
      args:
      - *id001
      - [0]
      - b
      state:
      - - 1
        - [!!python/long '2']
        - *id002
        - false
        - [dfsd, id]
      - [null]
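For comparison with the dump above, a much simpler route seems to be keeping only plain rows in the YAML file and building the DataFrame in Python after loading. This is a minimal sketch, not something taken from the PyYAML or pandas documentation: the top-level key "frame" and the inline CONFIG_TEXT string are hypothetical, and I have used the same two column names in both rows.

```python
import pandas as pd
import yaml

# Hypothetical config: each row is a plain YAML mapping, with no Python tags.
CONFIG_TEXT = """
frame:
  - {dimension1_id: 58, metric1: 10}
  - {dimension1_id: 50, metric1: 10}
"""

# safe_load only builds plain lists, dicts, and scalars.
config = yaml.safe_load(CONFIG_TEXT)

# Build the DataFrame in Python from the list of row dicts.
df = pd.DataFrame(config["frame"])
print(df)
```

The trade-off is that the YAML file no longer describes a DataFrame by itself; the script has to know which keys hold tabular data and convert them explicitly.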