python - How to read an HDF table from pandas?

Question

I have a file, my_file.h5, that presumably contains data in HDF5 format (PyTables). I tried to read this file using pandas:

import pandas as pd
store = pd.HDFStore('my_file.h5')

Then I tried to use the store object:

print store

As a result I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/pandas/io/pytables.py", line 133, in __repr__
    kind = v._v_attrs.pandas_type
  File "/usr/lib/python2.7/dist-packages/tables/attributeset.py", line 302, in __getattr__
    (name, self._v__nodePath)
AttributeError: Attribute 'pandas_type' does not exist in node: '/data'

Does anyone know what I am doing wrong? Could the problem be that my *.h5 file is not what I think it is (that is, not data in HDF5 format)?
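One way to test the "is this really HDF5?" suspicion without pandas is to look for the HDF5 format signature. The sketch below is only a rough check: the helper name looks_like_hdf5 and the throwaway demo files are made up for illustration, and a matching signature says nothing about whether pandas wrote the file. In the traceback above PyTables already reports a node '/data', which suggests the file was opened as HDF5.

```python
import os
import tempfile

# The 8-byte HDF5 format signature. It sits at byte offset 0, or at
# 512, 1024, 2048, ... when the file starts with a user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the HDF5 signature is found at a valid offset.

    Hypothetical helper for illustration; it does not validate the
    rest of the file.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Candidate offsets: 0, then 512 and successive doublings.
            offset = 512 if offset == 0 else offset * 2
    return False


# Demo on two throwaway files (not real data).
tmp_dir = tempfile.mkdtemp()

signature_only = os.path.join(tmp_dir, "signature_only.h5")
with open(signature_only, "wb") as fh:
    fh.write(HDF5_SIGNATURE + b"\x00" * 16)

plain_text = os.path.join(tmp_dir, "plain_text.h5")
with open(plain_text, "wb") as fh:
    fh.write(b"a,b\n1,2\n")

print(looks_like_hdf5(signature_only))  # True
print(looks_like_hdf5(plain_text))      # False
```

To check the file from the question, call the helper on 'my_file.h5'.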

score 3 · Accepted Answer

In your /usr/lib/pymodules/python2.7/pandas/io/pytables.py, line 133 is

kind = v._v_attrs.pandas_type

In my pytables.py I see

kind = getattr(n._v_attrs,'pandas_type',None)

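Because it uses getattr, kind is set to None if there is no pandas_type attribute. I'm guessing my version of pandas

In [7]: import pandas as pd

In [8]: pd.__version__
Out[8]: '0.10.0'

is newer than yours. If so, the fix is to upgrade your pandas.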
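The difference between the two lines can be reproduced in plain Python. This is a minimal sketch: NodeAttrs is a hypothetical stand-in for a node's attribute set, not the PyTables class, and it only shows why direct attribute access raises while getattr with a default does not.

```python
class NodeAttrs:
    """Hypothetical stand-in for the attributes of a node that was
    not written by pandas, so it has no pandas_type attribute."""

    title = "plain PyTables table"


attrs = NodeAttrs()

# Direct access, as in the older line 133: raises AttributeError.
try:
    kind = attrs.pandas_type
except AttributeError as exc:
    error_message = str(exc)
    print("direct access failed:", error_message)

# getattr with a default, as in the newer line: falls back to None.
kind = getattr(attrs, "pandas_type", None)
print(kind)  # None
```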

score 2 · Accepted Answer

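I had an h5 table. It was made with pytables, independently of pandas, so I needed to turn it into a list of tuples and then import that into a df. This works nicely because it lets me use my pytables index to run a "where" on the input, which saves me from reading all the rows.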
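The approach described above might look roughly like this. It is a sketch under assumptions, not the answerer's actual code: the file name, the node name "data", the column names key and value, and the query condition are all invented for illustration, and the example first builds a small PyTables file so that it is self-contained.

```python
import os
import tempfile

import pandas as pd
import tables


class Record(tables.IsDescription):
    """Hypothetical table layout for the demo."""

    key = tables.Int64Col(pos=0)
    value = tables.Float64Col(pos=1)


# Build a small PyTables file without pandas (stand-in for the real file).
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "data", Record)
    table.append([(i, i * 0.5) for i in range(100)])
    table.flush()
    # Index the column used in the query.
    table.cols.key.create_index()

# Read back only the matching rows, then hand them to pandas.
with tables.open_file(path, mode="r") as h5:
    table = h5.root.data
    # read_where returns a NumPy structured array; tolist() turns it
    # into a list of tuples, one tuple per row.
    rows = table.read_where("key >= 90").tolist()
    columns = list(table.colnames)

df = pd.DataFrame(rows, columns=columns)
print(df.shape)  # (10, 2)
```

Only the rows matching the condition are turned into tuples, so the DataFrame never has to hold the full table.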
