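I need to implement a moving average in my own way: the input contains samples only for non-zero values, but the output should be computed for every time tick, including the empty ones that are absent from the input.

Code sample: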

time_step = 120                 # spacing between output ticks
window_size = time_step * 30    # each window spans 30 ticks
ma_array = []

def my_rolling_mean():
    # Two cursors over the same rows: one trails the window start,
    # the other runs ahead to the window end.
    window_start_iter = extent_df.itertuples()
    window_end_iter = extent_df.itertuples()
    window_start_tuple = next(window_start_iter)
    window_end_tuple = None
    next_window_end_tuple = next(window_end_iter)
    rolling_sum = 0

    for t_i_start in xrange(start_log_time, end_log_time - window_size, time_step):
        t_i_end = t_i_start + window_size

        # Drop samples that have fallen behind the window start.
        while window_start_tuple[0][0] < t_i_start:  # time
            rolling_sum -= window_start_tuple[1]  # value
            window_start_tuple = next(window_start_iter)

        # Add samples that have entered the window.
        while next_window_end_tuple[0][0] < t_i_end:
            window_end_tuple = next_window_end_tuple
            next_window_end_tuple = next(window_end_iter)
            rolling_sum += window_end_tuple[1]

        # Divide by the number of ticks in the window, not by the number of samples.
        ma_i = float(rolling_sum) / ((t_i_end - t_i_start) / time_step)
        ma_array.append(ma_i)
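
To make the intended semantics concrete, here is a minimal, self-contained sketch of the same windowed mean on plain lists. It assumes Python 3 and uses prefix sums with binary search instead of the two iterators. The name `sparse_rolling_mean` and its parameters are hypothetical, chosen for illustration only, and this is not claimed to be the faster solution I am asking for.

```python
from bisect import bisect_left
from itertools import accumulate


def sparse_rolling_mean(times, values, start, end, window, step):
    """Mean over [t, t + window) for every tick t; missing ticks count as 0.

    Hypothetical helper: `times` must be sorted ascending and
    `values[i]` is the sample recorded at `times[i]`.
    """
    # prefix[i] holds sum(values[:i]), so any range sum is one subtraction.
    prefix = [0] + list(accumulate(values))
    ticks_per_window = window // step
    result = []
    for t_start in range(start, end - window, step):
        lo = bisect_left(times, t_start)           # first sample with time >= t_start
        hi = bisect_left(times, t_start + window)  # first sample with time >= t_end
        result.append((prefix[hi] - prefix[lo]) / ticks_per_window)
    return result


# Samples exist only at ticks 120 and 360; tick 240 is absent and counts as 0.
means = sparse_rolling_mean([120, 360], [10, 30],
                            start=0, end=720, window=240, step=120)
print(means)  # [5.0, 5.0, 15.0, 15.0]
```

The divisor is the number of ticks in the window, so ticks without a sample pull the mean down, which is the behaviour `my_rolling_mean` is meant to have.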

*pandas.rolling_mean* is about 100 times faster than *my_rolling_mean*:

In [342]: extent_df[:10]
Out[342]: 
             TOTAL_RR
TIME EXTENT          
120  0             10
240  0             20
360  0             30
480  0             40
600  0             50
720  0             60
840  0             87
960  0             87
1080 0             87
1200 0             87

In [343]: len(extent_df)
Out[343]: 9110

In [344]: %timeit my_rolling_mean()
10 loops, best of 3: 26.3 ms per loop

In [345]: %timeit pd.rolling_mean(extent_df, 3600)
1000 loops, best of 3: 232 µs per loop

Please advise how to improve the performance.

Thanks in advance,
Slava
