I want to compute weekly returns, but counted backwards from the end date. Here is my initial attempt with pandas:
import pandas as pd
import numpy as np
from pandas.tseries.offsets import BDay

# Business-day index and a synthetic price series rising by 0.002 per day
index = pd.date_range(start='2020-09-13', end='2020-10-13', freq=BDay())
index_len = len(index)
dfw = pd.DataFrame(data=np.arange(start=1, stop=1+(index_len-1)*0.002, step=0.002),
                   index=index,
                   columns=['col1'])

def weekly_ret(x):
    # Simple return between the first and last observation of the window
    if x.size > 0:
        print(f"range is {x.index[0]} - {x.index[-1]}")
        return (x.iloc[-1] - x.iloc[0]) / x.iloc[0]
    else:
        return np.nan

# Resample into 5-business-day bins, counted from the start of the index
dfw = dfw.resample(rule='5B').apply(weekly_ret)
print(dfw)
This gives the following output, which is not what I want:
range is 2020-09-14 00:00:00 - 2020-09-18 00:00:00
range is 2020-09-21 00:00:00 - 2020-09-25 00:00:00
range is 2020-09-28 00:00:00 - 2020-10-02 00:00:00
range is 2020-10-05 00:00:00 - 2020-10-09 00:00:00
range is 2020-10-12 00:00:00 - 2020-10-13 00:00:00
                col1
2020-09-14  0.008000
2020-09-21  0.007921
2020-09-28  0.007843
2020-10-05  0.007767
2020-10-12  0.001923
I want it to start from 2020-10-13 and work backwards, so that the last range is:
range is 2020-10-07 00:00:00 - 2020-10-13 00:00:00
instead of:
range is 2020-10-12 00:00:00 - 2020-10-13 00:00:00
So far I have tried:
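1. Reversing the DataFrame:
   dfw = dfw.reindex(index=dfw.index[::-1])
2. Step #1 above plus a rule of -5B, which raises an error.
3. Using the origin parameter of resample, i.e. origin=dfw.index[-1], but this has no effect on the order of the computation.
4. Step #1 above plus a rolling computation over every row of the reversed DataFrame:
   dfw = dfw.rolling(5).apply(weekly_ret)[::5]
   Here I get NaN for the first (i.e. last) interval, and this solution is also somewhat wasteful.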
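A likely reason for the NaN in the rolling attempt: rolling(5) leaves the first four rows of whatever frame it receives empty, and on a reversed frame [::5] starts at exactly such a row. The following is a minimal sketch, not a definitive fix, that keeps the data in date order and instead steps backwards from the last row. It assumes the rows are consecutive business days, so a 5-row window is one week. The names prices, window, trailing and weekly are illustrative, and pct_change is swapped in for rolling(5).apply because it computes the same first-to-last return without calling a Python function per window.

```python
import numpy as np
import pandas as pd

# Synthetic prices 1.000, 1.002, ..., 1.042, as in the question.
# `prices` is an illustrative name, not something from the original code.
index = pd.date_range(start="2020-09-13", end="2020-10-13", freq="B")
prices = pd.Series(1 + 0.002 * np.arange(len(index)), index=index, name="col1")

window = 5  # assumed: 5 consecutive business-day rows make one week

# Return over each trailing 5-row window: (p[t] - p[t-4]) / p[t-4].
trailing = prices.pct_change(periods=window - 1)

# Walk backwards from the last row in steps of 5, then restore date order.
# Rows with fewer than 5 observations behind them are NaN and are dropped.
weekly = trailing.iloc[::-window].iloc[::-1].dropna()
print(weekly)
```

Because the selection starts at the last row, the final window always ends on the last date in the index, and any incomplete window falls at the beginning of the series, where it is dropped.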
Update: this would be the desired output; note that the last return covers the week counted back from the last day in the index:
range is 2020-09-16 00:00:00 - 2020-09-22 00:00:00 = 0.007968127490039847
range is 2020-09-23 00:00:00 - 2020-09-29 00:00:00 = 0.00788954635108482
range is 2020-09-30 00:00:00 - 2020-10-06 00:00:00 = 0.007812500000000007
range is 2020-10-07 00:00:00 - 2020-10-13 00:00:00 = 0.00773694390715668
                col1
2020-09-22  0.007968
2020-09-29  0.007890
2020-10-06  0.007813
2020-10-13  0.007737  i.e. (1.042 - 1.034)/1.034
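One way this output might be reproduced, offered as a sketch under assumptions rather than a definitive answer: instead of resampling on calendar bins, number the rows from the end and group every 5 consecutive rows into one block. This assumes the index contains only consecutive business days. The names prices, from_end, block, complete and weekly are hypothetical helpers introduced for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic prices 1.000, 1.002, ..., 1.042, as in the question.
index = pd.date_range(start="2020-09-13", end="2020-10-13", freq="B")
prices = pd.Series(1 + 0.002 * np.arange(len(index)), index=index, name="col1")

window = 5  # assumed: 5 consecutive business-day rows make one week

# Position of each row counted from the end: 0 for the last row.
from_end = np.arange(len(prices))[::-1]
# Block 0 holds the last 5 rows, block 1 the 5 rows before that, and so on.
block = from_end // window

grouped = prices.groupby(block)
first, last = grouped.first(), grouped.last()
returns = (last - first) / first

# Label each block by its final date and keep only complete 5-row blocks.
end_dates = prices.index.to_series().groupby(block).last()
complete = grouped.size() == window
weekly = pd.Series(
    returns[complete].to_numpy(),
    index=pd.DatetimeIndex(end_dates[complete].to_numpy()),
    name="col1",
).sort_index()
print(weekly)
```

The two leftover rows at the start of the index (2020-09-14 and 2020-09-15) form an incomplete block and are dropped, which matches the desired output above. If partial weeks should be kept, the complete filter can be removed.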