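I need to implement a moving average in my own way: the input contains only the samples with non-zero values, but the output should be computed for every time tick, including the empty ticks that are absent from the input.
Code example: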
time_step = 120
window_size = time_step * 30
ma_array = []

def my_rolling_mean():
    # Two iterators over the same frame: one trails the window start,
    # the other leads the window end.
    window_start_iter = extent_df.itertuples()
    window_end_iter = extent_df.itertuples()
    window_start_tuple = next(window_start_iter)
    window_end_tuple = None
    next_window_end_tuple = next(window_end_iter)
    rolling_sum = 0
    for t_i_start in xrange(start_log_time, end_log_time - window_size, time_step):
        t_i_end = t_i_start + window_size
        # Drop samples that fell out of the window on the left.
        while window_start_tuple[0][0] < t_i_start:  # time
            rolling_sum -= window_start_tuple[1]  # value
            window_start_tuple = next(window_start_iter)
        # Add samples that entered the window on the right.
        while next_window_end_tuple[0][0] < t_i_end:
            window_end_tuple = next_window_end_tuple
            next_window_end_tuple = next(window_end_iter)
            rolling_sum += window_end_tuple[1]
        # Divide by the number of ticks in the window, not by the number of samples.
        ma_i = float(rolling_sum) / ((t_i_end - t_i_start) / time_step)
        ma_array.append(ma_i)
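The intended output can be pinned down with a naive reference version. This is only a sketch under stated assumptions, not the code that was timed: `naive_sparse_rolling_mean` and the `samples` dict are hypothetical names introduced for illustration, samples are assumed to fall exactly on the tick grid, and the window is taken as half-open, `[t_start, t_start + window_size)`, as in the loop above. It is written for Python 3 (`range` instead of `xrange`).

```python
def naive_sparse_rolling_mean(samples, start, end, time_step, window_size):
    """Reference version: average over every tick, treating missing ticks as 0.

    samples: dict mapping tick time -> value, holding non-zero ticks only
             (hypothetical input layout, chosen for illustration).
    """
    ticks_per_window = window_size // time_step
    means = []
    for t_start in range(start, end - window_size, time_step):
        # Sum over all ticks in [t_start, t_start + window_size);
        # ticks absent from the input contribute 0.
        window_sum = sum(
            samples.get(t, 0)
            for t in range(t_start, t_start + window_size, time_step)
        )
        means.append(float(window_sum) / ticks_per_window)
    return means


# Small usage example: three non-zero samples on a 120-second grid.
example = naive_sparse_rolling_mean(
    {120: 10, 240: 20, 600: 50},
    start=120, end=840, time_step=120, window_size=360,
)
print(example)
```

This version is quadratic in the window length and is useful only as a correctness check for a faster implementation.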
*pandas.rolling_mean* is about 100 times faster than *my_rolling_mean*:
In [342]: extent_df[:10]
Out[342]:
             TOTAL_RR
TIME EXTENT
120  0             10
240  0             20
360  0             30
480  0             40
600  0             50
720  0             60
840  0             87
960  0             87
1080 0             87
1200 0             87
In [343]: len(extent_df)
Out[343]: 9110
In [344]: %timeit my_rolling_mean()
10 loops, best of 3: 26.3 ms per loop
In [345]: %timeit pd.rolling_mean(extent_df, 3600)
1000 loops, best of 3: 232 µs per loop
Please advise how to improve the performance.
Thanks in advance,
Slava