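IIUC, using resample to add information ('D' -> 'T') is not the right choice, especially if you want to forward-fill values. You can use np.vsplit to split your array into a panel-like stack of per-date blocks, repeat each block according to your new DatetimeIndex, and finally reshape the data: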
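import numpy as np
import pandas as pd

# Create new MultiIndex: one entry per minute, from the first day up to
# (but not including) the day after the last one.
# Note: on pandas >= 1.4 use inclusive='left'; closed= was removed in pandas 2.0.
dates2 = pd.date_range(X.index.levels[0].min(),
                       X.index.levels[0].max() + pd.DateOffset(days=1),
                       freq='T', closed='left')
mi = pd.MultiIndex.from_product([dates2, X.index.levels[1]])
# Manipulate your array: split into one block per day, then repeat each block
# once per minute of the day (24*60 times)
vals = np.array(np.repeat(np.vsplit(X.values, len(X.index.levels[0])), 24*60, axis=0))
# Flatten the (n_blocks, n_keys, n_columns) stack back into 2D
vals = vals.reshape(vals.shape[0]*vals.shape[1], vals.shape[2])
# New dataframe
out = pd.DataFrame(vals, index=mi, columns=X.columns)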
On a smaller sample:
>>> df
A B
2012-01-01 A 11 12
B 13 14
2012-01-02 A 21 22
B 23 24
2012-01-03 A 31 32
B 33 34
# Resample: 12H and 2 values per day
# dates2 = pd.date_range(df.index.levels[0].min(), df.index.levels[0].max() + pd.DateOffset(days=1), freq='12H', closed='left')
# mi = pd.MultiIndex.from_product([dates2, df.index.levels[1]])
# vals = np.array(np.repeat(np.vsplit(df.values, len(df.index.levels[0])), 2, axis=0))
# vals = vals.reshape(vals.shape[0]*vals.shape[1], vals.shape[2])
# out = pd.DataFrame(vals, index=mi, columns=df.columns)
>>> out
A B
2012-01-01 00:00:00 A 11 12
B 13 14
2012-01-01 12:00:00 A 11 12
B 13 14
2012-01-02 00:00:00 A 21 22
B 23 24
2012-01-02 12:00:00 A 21 22
B 23 24
2012-01-03 00:00:00 A 31 32
B 33 34
2012-01-03 12:00:00 A 31 32
B 33 34
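For reference, the commented steps above can be wrapped in a small helper. This is only a sketch: the name upsample_days and its step argument are made up for illustration, and it assumes a sorted two-level MultiIndex (date, key) in which every consecutive day carries every key and step divides one day evenly. It builds the new index with periods= instead of closed='left', so it should not depend on the pandas version.

```python
import numpy as np
import pandas as pd


def upsample_days(df, step):
    """Repeat each daily block of df once per `step` within the day.

    Hypothetical helper: assumes a sorted (date, key) MultiIndex with
    consecutive days, every key present on every day, and a `step`
    that divides one day evenly.
    """
    days = df.index.levels[0]
    keys = df.index.levels[1]
    step = pd.Timedelta(step)
    per_day = int(pd.Timedelta(days=1) / step)

    # New time axis: per_day slots for each original day
    new_times = pd.date_range(days.min(), periods=len(days) * per_day, freq=step)
    mi = pd.MultiIndex.from_product([new_times, keys])

    # One (n_keys, n_columns) block per day, each repeated per_day times
    blocks = np.vsplit(df.to_numpy(), len(days))
    vals = np.repeat(blocks, per_day, axis=0)

    # Flatten the 3D stack back to 2D, one row per (time, key) pair
    vals = vals.reshape(-1, df.shape[1])
    return pd.DataFrame(vals, index=mi, columns=df.columns)


# Same small sample as above
days = pd.date_range("2012-01-01", periods=3, freq="D")
idx = pd.MultiIndex.from_product([days, ["A", "B"]])
df = pd.DataFrame(
    [[11, 12], [13, 14], [21, 22], [23, 24], [31, 32], [33, 34]],
    index=idx,
    columns=["A", "B"],
)

out = upsample_days(df, pd.Timedelta(hours=12))
print(out)
```

Passing pd.Timedelta(minutes=1) as step would correspond to the 'T' case on X.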
With your code:
>>> df.unstack().resample("12H").first().ffill().stack()
A B
2012-01-01 00:00:00 A 11.0 12.0
B 13.0 14.0
2012-01-01 12:00:00 A 11.0 12.0
B 13.0 14.0
2012-01-02 00:00:00 A 21.0 22.0
B 23.0 24.0
2012-01-02 12:00:00 A 21.0 22.0
B 23.0 24.0
2012-01-03 00:00:00 A 31.0 32.0
B 33.0 34.0
# <- Lost 2012-01-03 12:00:00
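The last slot goes missing because resample only creates bins between the first and the last timestamp that are actually present, so the final day gets its 00:00 bin and nothing after it. The sketch below reproduces that on the small sample (in wide form, without the final stack) and, as one possible alternative that is not part of your code, reindexes onto an explicit target index before forward-filling. The names half_day, full and fixed are just for illustration.

```python
import pandas as pd

days = pd.date_range("2012-01-01", periods=3, freq="D")
idx = pd.MultiIndex.from_product([days, ["A", "B"]])
df = pd.DataFrame(
    [[11, 12], [13, 14], [21, 22], [23, 24], [31, 32], [33, 34]],
    index=idx,
    columns=["A", "B"],
)

half_day = pd.Timedelta(hours=12)

# Wide form: one row per day, columns are (column, key) pairs
wide = df.unstack()

# resample stops at the last observed timestamp (2012-01-03 00:00)
resampled = wide.resample(half_day).first().ffill()
print(len(resampled), resampled.index[-1])

# Assumed alternative: build the full target index explicitly, then ffill
full = pd.date_range(days[0], periods=len(days) * 2, freq=half_day)
fixed = wide.reindex(full).ffill()
print(len(fixed), fixed.index[-1])
```

This keeps the last half-day, but it still goes through unstack and NaN filling (hence the float values), which the array approach avoids.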
Performance on X:
>>> %timeit op_resample()
9.1 s ± 568 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit new_array()
1.86 s ± 23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)