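I am facing the same problem as the one raised in "How to trouble-shoot HDFStore Exception: cannot find the correct atom type".
I reduced it to the example given in the pandas documentation, "Storing Mixed Types in a Table".
The whole point of that example is to append a DataFrame with some missing values to an HDFStore. When I run the example code, I end up with an atom type error.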
df_mixed
Out[103]:
A B C bool datetime64 int string
0 -0.065617 -0.062644 -0.004758 True 2001-01-02 00:00:00 1 string
1 1.444643 1.664311 -0.189095 True 2001-01-02 00:00:00 1 string
2 0.569412 -0.077504 -0.125590 True 2001-01-02 00:00:00 1 string
3 NaN NaN 0.563939 True NaN 1 NaN
4 NaN NaN -0.618218 True NaN 1 NaN
5 NaN NaN 1.477307 True NaN 1 NaN
6 -0.287331 0.984108 -0.514628 True 2001-01-02 00:00:00 1 string
7 -0.244192 0.239775 0.861359 True 2001-01-02 00:00:00 1 string
store=HDFStore('df.h5')
store.append('df_mixed', df_mixed, min_itemsize={'values':50})
...
Exception: cannot find the correct atom type -> [dtype->object,items->Index([datetime64, string], dtype=object)] object of type 'Timestamp' has no len()
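If I force the dtype of the problematic columns (in fact, the object ones), as suggested in the linked post (Jeff's answer), I still get the same error. What am I missing here?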
dtypes = [('datetime64', '|S20'), ('string', '|S20')]
store=HDFStore('df.h5')
store.append('df_mixed', df_mixed, dtype=dtypes, min_itemsize={'values':50})
...
Exception: cannot find the correct atom type -> [dtype->object,items->Index([datetime64, string], dtype=object)] object of type 'Timestamp' has no len()
Thanks for your insights.
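SOLVED
I was using pandas 0.10 and switched to 0.11-dev. As Jeff inferred, the issue was NaN versus NaT.
With the former pandas version, the assignment
df_mixed.ix[3:5,['A', 'B', 'string', 'datetime64']] = np.nan
left NaN in the datetime64 column, such that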
2 0.569412 -0.077504 -0.125590 True 2001-01-02 00:00:00 1 string
3 NaN NaN 0.563939 True NaN 1 NaN
whereas the later version produced NaT there:
2 0.569412 -0.077504 -0.125590 True 2001-01-02 00:00:00 1 string
3 NaN NaN 0.563939 True NaT 1 NaN
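The NaN versus NaT distinction can be checked directly. The following is a minimal sketch written against a current pandas release (not 0.10, and it does not use the removed .ix indexer). It builds a small frame whose datetime64 column has object dtype because it mixes Timestamp values with a float NaN, detects that column, and converts it with pd.to_datetime so the missing entry becomes NaT. The helper name object_datetime_columns is hypothetical and not part of pandas or the original post; whether this conversion alone would have avoided the error on pandas 0.10 is an assumption, not something the post confirms.

```python
import numpy as np
import pandas as pd


def object_datetime_columns(df):
    """Return names of object-dtype columns holding at least one Timestamp.

    Hypothetical helper for illustration only; not a pandas API.
    """
    cols = []
    for name in df.columns:
        col = df[name]
        has_timestamp = col.map(lambda v: isinstance(v, pd.Timestamp)).any()
        if col.dtype == object and has_timestamp:
            cols.append(name)
    return cols


# A reduced stand-in for df_mixed: the datetime column mixes Timestamp
# objects with a float NaN, so it is stored with object dtype.
df_mixed = pd.DataFrame({
    "datetime64": pd.Series(
        [pd.Timestamp("2001-01-02"), np.nan, pd.Timestamp("2001-01-02")],
        dtype=object,
    ),
    "string": pd.Series(["string", np.nan, "string"], dtype=object),
})

bad = object_datetime_columns(df_mixed)

# Convert the offending columns; the float NaN becomes NaT and the
# column gets a proper datetime64 dtype.
for name in bad:
    df_mixed[name] = pd.to_datetime(df_mixed[name])

print(bad)
print(df_mixed.dtypes)
```

After the conversion, the datetime64 column no longer has object dtype, which is the condition the exception message complains about for that column; the string column is left untouched.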