
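I'm new to Python and pandas. I'm trying to get the time delta (in seconds) between dynamic ranges of a time-series DataFrame, where each range is defined by a difference in values. My sample DataFrame is: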

time                          price
2013-04-26 09:30:03-04:00       101
2013-04-26 09:30:04-04:00       101
2013-04-26 09:30:05-04:00       102
2013-04-26 09:30:06-04:00       105
2013-04-26 09:30:07-04:00       104
2013-04-26 09:30:08-04:00       105
2013-04-26 09:30:09-04:00       106
2013-04-26 09:30:10-04:00       104
2013-04-26 09:30:11-04:00       110
2013-04-26 09:30:12-04:00       109
2013-04-26 09:30:13-04:00       111
2013-04-26 09:30:14-04:00       108
2013-04-26 09:30:15-04:00       106
2013-04-26 09:30:16-04:00       107
2013-04-26 09:30:17-04:00       107
2013-04-26 09:30:18-04:00       108
2013-04-26 09:30:19-04:00       109
2013-04-26 09:30:20-04:00       109
2013-04-26 09:30:21-04:00       110

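I want the time delta between points where the price differs by at least 4. Once that price difference is reached, that price point becomes the "starting point" for the next calculation, and so on. The desired result looks like this (time delta in seconds):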

time                       price    time delta
2013-04-26 09:30:03-04:00   101 
2013-04-26 09:30:04-04:00   101 
2013-04-26 09:30:05-04:00   102 
2013-04-26 09:30:06-04:00   105      3
2013-04-26 09:30:07-04:00   104 
2013-04-26 09:30:08-04:00   105 
2013-04-26 09:30:09-04:00   106 
2013-04-26 09:30:10-04:00   104 
2013-04-26 09:30:11-04:00   110      5
2013-04-26 09:30:12-04:00   109 
2013-04-26 09:30:13-04:00   111 
2013-04-26 09:30:14-04:00   108 
2013-04-26 09:30:15-04:00   106      4
2013-04-26 09:30:16-04:00   107 
2013-04-26 09:30:17-04:00   107 
2013-04-26 09:30:18-04:00   108 
2013-04-26 09:30:19-04:00   109 
2013-04-26 09:30:20-04:00   109 
2013-04-26 09:30:21-04:00   110      6

1 Answer


I'm not sure how well this performs, since it loops over the rows in Python.

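import numpy as np
import pandas as pd

def get_deltas(gen):
    # gen yields (timestamp, price) pairs; the first pair is the initial start point
    time, value = next(gen)
    deltas = [np.nan]  # the first row has no earlier start point
    for line in gen:
        if np.abs(line[1] - value) >= 4:
            # price moved by at least 4: record the elapsed time and restart from here
            deltas.append(np.abs(line[0] - time))
            time, value = line
        else:
            deltas.append(np.nan)
    return deltas

# Series.items() replaces iteritems(), which was removed in pandas 2.0
deltas = get_deltas(df.price.items())
df['deltas'] = deltas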

In [58]: df
Out[58]: 
                     price   deltas
time                               
2013-04-26 13:30:03    101      NaN
2013-04-26 13:30:04    101      NaN
2013-04-26 13:30:05    102      NaN
2013-04-26 13:30:06    105  0:00:03
2013-04-26 13:30:07    104      NaN
2013-04-26 13:30:08    105      NaN
2013-04-26 13:30:09    106      NaN
2013-04-26 13:30:10    104      NaN
2013-04-26 13:30:11    110  0:00:05
2013-04-26 13:30:12    109      NaN
2013-04-26 13:30:13    111      NaN
2013-04-26 13:30:14    108      NaN
2013-04-26 13:30:15    106  0:00:04
2013-04-26 13:30:16    107      NaN
2013-04-26 13:30:17    107      NaN
2013-04-26 13:30:18    108      NaN
2013-04-26 13:30:19    109      NaN
2013-04-26 13:30:20    109      NaN
2013-04-26 13:30:21    110  0:00:06
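
The question asks for the delta in seconds, while the output above holds timedelta values. Here is a minimal, self-contained sketch of the same loop that stores plain seconds instead. The helper name `seconds_between_moves`, the `threshold` parameter and the `delta_seconds` column are hypothetical names chosen for illustration. The sample index is rebuilt as timezone-naive timestamps, which is an assumption that does not affect the differences.

```python
import numpy as np
import pandas as pd

# Sample prices from the question, one row per second starting at 09:30:03.
# Timezone-naive timestamps are an assumption made to keep the sketch simple.
prices = [101, 101, 102, 105, 104, 105, 106, 104, 110, 109,
          111, 108, 106, 107, 107, 108, 109, 109, 110]
index = pd.date_range("2013-04-26 09:30:03", periods=len(prices),
                      freq=pd.Timedelta(seconds=1))
df = pd.DataFrame({"price": prices}, index=index)


def seconds_between_moves(price, threshold=4):
    """Hypothetical helper: seconds elapsed whenever the price has moved
    by at least `threshold` from the current start point."""
    items = price.items()
    start_time, start_value = next(items)  # assumes a non-empty Series
    out = [np.nan]  # the first row has no earlier start point
    for time, value in items:
        if abs(value - start_value) >= threshold:
            out.append((time - start_time).total_seconds())
            start_time, start_value = time, value  # restart from this row
        else:
            out.append(np.nan)
    return pd.Series(out, index=price.index, name="delta_seconds")


df["delta_seconds"] = seconds_between_moves(df["price"])
hits = df["delta_seconds"].dropna()
print(hits.tolist())  # [3.0, 5.0, 4.0, 6.0]
```

The non-NaN rows fall on 09:30:06, 09:30:11, 09:30:15 and 09:30:21, matching the desired result in the question. This is still a row-by-row Python loop, so the performance caveat above applies to it as well.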
answered 2013-06-28T21:51:22.187