python - Can't subtract datetime64 using DataFrame.eval()

Question

Given a DataFrame with a couple of timestamp columns:

In [88]: df.dtypes
Out[88]:
Time             datetime64[ns]
uniqstime        datetime64[ns]
dtype: object

If I call eval(), I get an error about an unknown type:

In [91]: df.eval('since = Time - uniqstime')
...

ValueError: unkown type timedelta64[ns]

(By the way, "unknown" is misspelled as "unkown" in the error message.)

But I can use plain Python notation:

In [92]: df['since'] = df.Time - df.uniqstime

Is there a problem with assigning a timedelta in numexpr?

score 3 · Accepted Answer

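This is already an issue on GitHub (although it has been closed); see https://github.com/pydata/pandas/issues/5007

It's not supported at the moment. However, there isn't really an advantage to it right now, since these calculations are done in Python space anyway.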

score 3 · Accepted Answer

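Unless you just want to make your code shorter and more readable (a laudable goal), numexpr would have to support timedelta64 operations for there to be any performance benefit. As @Jeff said, these (and datetime64 operations) are evaluated in Python space because numexpr doesn't support pandas NaT (Not-a-Time). However, the non-timedelta64 operations are still evaluated with numexpr, so you would probably need a very large timedelta64 array before this became a bottleneck.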
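To illustrate the NaT point, here is a minimal sketch of the plain pandas subtraction, which runs in Python space and propagates missing timestamps. The sample values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; None becomes NaT after conversion.
df = pd.DataFrame({
    "Time": pd.to_datetime(["2013-10-01 12:00", "2013-10-02 08:30", None]),
    "uniqstime": pd.to_datetime(["2013-10-01 09:00", None, "2013-10-03 00:00"]),
})

# Ordinary column arithmetic: any row with a NaT operand yields NaT.
since = df["Time"] - df["uniqstime"]
print(since)
```

Only the first row has both timestamps, so only that row gets a real timedelta; the other two come out as NaT.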

score 0 · Accepted Answer

As of pandas 0.23, you can do this by setting the engine parameter to 'python', e.g.:

df.eval('since = Time - uniqstime', engine='python')
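A self-contained version of that call might look like the sketch below. The sample frame is hypothetical, and it assumes a pandas version where DataFrame.eval returns a new DataFrame by default rather than modifying df in place.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "Time": pd.to_datetime(["2013-10-01 12:00", "2013-10-02 08:30"]),
    "uniqstime": pd.to_datetime(["2013-10-01 09:00", "2013-10-01 20:30"]),
})

# Evaluate the expression with the Python engine instead of numexpr.
result = df.eval("since = Time - uniqstime", engine="python")

# The same computation in ordinary notation, for comparison.
expected = df["Time"] - df["uniqstime"]
print(result["since"])
```

The new column should match what the ordinary df.Time - df.uniqstime subtraction produces.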

From the pandas documentation for pandas.eval:

engine : string or None, default 'numexpr', {'python', 'numexpr'}
    The engine used to evaluate the expression. Supported engines are
    - None         : tries to use ``numexpr``, falls back to ``python``
    - ``'numexpr'``: This default engine evaluates pandas objects using
                     numexpr for large speed ups in complex expressions
                     with large frames.
    - ``'python'``: Performs operations as if you had ``eval``'d in top
                    level python. This engine is generally not that useful.
    More backends may be available in the future.

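I disagree with the claim that it is "not that useful". In my view it can shorten the code needed for certain operations, and sometimes that comes in handy.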
