python - Pandas rolling mean with a rolling mask / excluding entries

Question

I have a pandas DataFrame with a time index like this:

import pandas as pd
import numpy as np

idx = pd.date_range(start='2000',end='2001')
df = pd.DataFrame(np.random.normal(size=(len(idx),2)),index=idx)

which looks like this:

                   0         1
2000-01-01  0.565524  0.355548
2000-01-02 -0.234161  0.888384

I would like to compute a rolling average

df_avg = df.rolling(60).mean()

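but always excluding the entries corresponding to (let's say) 10 days before, +- 2 days. In other words, for each date, df_avg should contain the mean (exponentially weighted with ewm, or flat) of the previous 60 entries, but excluding the entries from t-48 to t-52. I guess I should use some kind of rolling mask, but I don't know how. I could also compute two separate averages and obtain the result as their difference, but that looks dirty, and I wonder whether there is a better way that generalizes to other non-linear computations...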

Thanks a lot!

score 2 · Accepted Answer

You can use apply with a custom function:

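# window positions to average over (positions 8 to 12 are skipped)
avg_idx = [idx for idx in range(60) if idx not in range(8, 13)]

# rolling computation, averaging only the selected positions;
# raw=True passes each window to the lambda as a NumPy array
df_avg = df.rolling(60).apply(lambda x: x[avg_idx].mean(), raw=True)

The x passed to apply always holds 60 rows of one column, so you can specify positional indexes on that basis: the first entry (0) is t-59 and the last entry (59) is t, because the window includes the current row. With raw=True, x is a NumPy array, so x[avg_idx] selects by position rather than by index label.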

I am not entirely sure about your exclusion logic, but you can easily adapt my solution to your case.
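To make the exclusion logic explicit, the apply approach can be wrapped in a small helper that takes lags instead of window positions. This is only a sketch: the name rolling_apply_excluding and its parameters are hypothetical and not part of pandas, and it assumes that lag 0 means the current row, so a window of 60 covers lags 0 to 59.

```python
import numpy as np
import pandas as pd

def rolling_apply_excluding(df, window, excluded_lags, func=np.mean):
    """Apply `func` over a rolling window, skipping rows at the given lags.

    Hypothetical helper: lag 0 is the current row, lag window-1 the oldest one.
    """
    excluded = set(excluded_lags)
    # Position p inside the window holds the row at lag (window - 1 - p).
    keep = np.array([p for p in range(window) if (window - 1 - p) not in excluded])
    # raw=True hands each window to the lambda as a NumPy array.
    return df.rolling(window).apply(lambda x: func(x[keep]), raw=True)

idx = pd.date_range(start="2000", end="2001")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(len(idx), 2)), index=idx)

# Flat mean over the 60-row window, without the rows from t-52 to t-48.
df_avg = rolling_apply_excluding(df, window=60, excluded_lags=range(48, 53))

# The same mask with a non-linear statistic.
df_med = rolling_apply_excluding(df, window=60, excluded_lags=range(48, 53), func=np.median)
```

Because the mask is applied before func is called, the same helper covers statistics that cannot be written as a difference of two averages. The cost is speed: apply calls a Python function once per window and column.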

score 0 · Accepted Answer

Unfortunately, no. From the pandas source code:

df.rolling(window, min_periods=None, freq=None, center=False, win_type=None, 
           on=None, axis=0, closed=None)

window : int, or offset
    Size of the moving window. This is the number of observations used for
    calculating the statistic. Each window will be a fixed size.

    If its an offset then this will be the time period of each window. Each
    window will be a variable sized based on the observations included in
    the time-period.
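Since the signature quoted above has no masking parameter, one workaround that stays within built-in rolling operations is the difference of two rolling sums mentioned in the question. The sketch below assumes a flat mean and a contiguous excluded block from t-52 to t-48; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

idx = pd.date_range(start="2000", end="2001")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(len(idx), 2)), index=idx)

window = 60                      # rows t-59 .. t
first_lag, last_lag = 48, 52     # excluded rows t-52 .. t-48
n_excluded = last_lag - first_lag + 1

# Sum over the full window.
total = df.rolling(window).sum()
# Sum over the excluded block: a 5-row rolling sum ending at t-48.
excluded = df.rolling(n_excluded).sum().shift(first_lag)

# Mean over the remaining 55 rows.
df_avg = (total - excluded) / (window - n_excluded)
```

This only works for statistics that are linear in the data, such as sums and flat means, so it does not generalize to medians or other non-linear computations.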
