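I want to compute a 10-second difference on a dataset whose time increments are irregular. The data lives in two 1-D arrays of equal length, one for time and the other for the data values.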

After some tinkering I was able to come up with a solution, but it is too slow, because (I suspect) it has to loop over every item in the array.

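My general approach is to loop over the time array and, for each time value, find the index of the time value x seconds earlier. I then use those indices on the data array to compute the difference.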

The code is shown below.

First, the find_closest function from Bi Rico:

def find_closest(A, target):
    #A must be sorted
    idx = A.searchsorted(target)
    idx = np.clip(idx, 1, len(A)-1)
    left = A[idx-1]
    right = A[idx]
    idx -= target - left < right - target
    return idx

Then I use it in the following way:

def trailing_diff(time_array,data_array,seconds):
    trailing_list=[]
    for i in range(len(time_array)):
        now=time_array[i]
        if now<seconds:
            trailing_list.append(0)
        else:
            then=find_closest(time_array,now-seconds)
            trailing_list.append(data_array[i]-data_array[then])
    return np.asarray(trailing_list)

Unfortunately, this does not scale well, and I would like to be able to compute this (and plot it) on the fly.

Any ideas on how to make it faster?

Edit: input/output

In [48]:time1
Out[48]:
array([  0.57200003,   0.579     ,   0.58800006,   0.59500003,
         0.5999999 ,   1.05999994,   1.55900002,   2.00900006,
         2.57599998,   3.05599999,   3.52399993,   4.00699997,
         4.09599996,   4.57299995,   5.04699993,   5.52099991,
         6.09299994,   6.55999994,   7.04099989,   7.50900006,
         8.07500005,   8.55799985,   9.023     ,   9.50699997,
         9.59399986,  10.07200003,  10.54200006,  11.01999998,
        11.58899999,  12.05699992,  12.53799987,  13.00499988,
        13.57599998,  14.05599999,  14.52399993,  15.00199985,
        15.09299994,  15.57599998,  16.04399991,  16.52199984,
        17.08899999,  17.55799985,  18.03699994,  18.50499988,
        19.0769999 ,  19.5539999 ,  20.023     ,  20.50099993,
        20.59099984,  21.07399988])

In [49]:weight1
Out[49]:
array([ 82.268,  82.268,  82.269,  82.272,  82.275,  82.291,  82.289,
        82.288,  82.287,  82.287,  82.293,  82.303,  82.303,  82.314,
        82.321,  82.333,  82.356,  82.368,  82.386,  82.398,  82.411,
        82.417,  82.419,  82.424,  82.424,  82.437,  82.45 ,  82.472,
        82.498,  82.515,  82.541,  82.559,  82.584,  82.607,  82.617,
        82.626,  82.626,  82.629,  82.63 ,  82.636,  82.651,  82.663,
        82.686,  82.703,  82.728,  82.755,  82.773,  82.8  ,  82.8  ,
        82.826])

In [50]:trailing_diff(time1,weight1,10)
Out[50]:
array([ 0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,
        0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,
        0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,
        0.   ,  0.169,  0.182,  0.181,  0.209,  0.227,  0.254,  0.272,
        0.291,  0.304,  0.303,  0.305,  0.305,  0.296,  0.274,  0.268,
        0.265,  0.265,  0.275,  0.286,  0.309,  0.331,  0.336,  0.35 ,
        0.35 ,  0.354])

1 Answer

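Use a ready-made interpolation routine. If you really want nearest-neighbor behavior, I think it has to be SciPy's scipy.interpolate.interp1d, but linear interpolation seems like a better option, and then you can use NumPy's numpy.interp: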

def trailing_diff(time, data, diff):
    ret = np.zeros_like(data)
    mask = (time - time[0]) >= diff
    ret[mask] = data[mask] - np.interp(time[mask] - diff,
                                       time, data)
    return ret

time = np.arange(10) + np.random.rand(10)/2
weight = 82 + np.random.rand(10)

>>> time
array([ 0.05920317,  1.23000929,  2.36399981,  3.14701595,  4.05128494,
        5.22100886,  6.07415922,  7.36161563,  8.37067107,  9.11371986])
>>> weight
array([ 82.14004969,  82.36214992,  82.25663272,  82.33764514,
        82.52985723,  82.67820915,  82.43440796,  82.74038368,
        82.84235675,  82.1333915 ])
>>> trailing_diff(time, weight, 3)
array([ 0.        ,  0.        ,  0.        ,  0.18093749,  0.20161107,
        0.4082712 ,  0.10430073,  0.17116831,  0.20691594, -0.31041841])
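
To see where one of these numbers comes from, the fourth output value can be reproduced by hand. This is only an illustrative check using the rounded values printed above; the variable names are made up for the example. The sample at time 3.147 looks back 3 seconds to time 0.147, which falls between the first two samples, so the weight there is linearly interpolated between them.

```python
import numpy as np

# First two samples and the fourth sample, copied from the printed arrays.
t0, t1 = 0.05920317, 1.23000929
w0, w1 = 82.14004969, 82.36214992
t_now, w_now = 3.14701595, 82.33764514

target = t_now - 3                  # the time 3 seconds earlier
frac = (target - t0) / (t1 - t0)    # position between the two bracketing samples
w_then = w0 + frac * (w1 - w0)      # linearly interpolated weight at `target`

manual = w_now - w_then
via_interp = w_now - np.interp(target, [t0, t1], [w0, w1])
print(manual, via_interp)           # both should be close to 0.18093749
```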

To get nearest-neighbor behavior instead, you would do:

from scipy.interpolate import interp1d

def trailing_diff(time, data, diff):
    ret = np.zeros_like(data)
    mask = (time - time[0]) >= diff
    interpolator = interp1d(time, data, kind='nearest')
    ret[mask] = data[mask] - interpolator(time[mask] - diff)
    return ret
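
If pulling in SciPy is not desirable, a NumPy-only nearest-neighbor variant is also possible, because searchsorted accepts a whole array of targets at once. The following is a sketch, not part of the original answer: find_closest_vec and trailing_diff_nearest are hypothetical names, the time array is assumed to be sorted with at least two samples, and exact ties between two neighbors may be resolved differently than interp1d does.

```python
import numpy as np

def find_closest_vec(sorted_times, targets):
    """Index of the entry in sorted_times nearest to each target (vectorized)."""
    idx = np.searchsorted(sorted_times, targets)
    # Keep idx in [1, n-1] so that idx-1 and idx are both valid neighbors.
    idx = np.clip(idx, 1, len(sorted_times) - 1)
    left = sorted_times[idx - 1]
    right = sorted_times[idx]
    # Step back one slot where the left neighbor is strictly closer.
    return np.where(targets - left < right - targets, idx - 1, idx)

def trailing_diff_nearest(time, data, diff):
    time = np.asarray(time, dtype=float)
    data = np.asarray(data, dtype=float)
    ret = np.zeros_like(data)
    # Same convention as above: zero until `diff` seconds after the first sample.
    mask = (time - time[0]) >= diff
    then = find_closest_vec(time, time[mask] - diff)
    ret[mask] = data[mask] - data[then]
    return ret
```

For example, trailing_diff_nearest([0, 1, 2, 3, 4], [0, 10, 20, 30, 40], 2) gives zeros for the first two samples and 20 for the rest, with no Python-level loop over the array.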
Answered 2013-08-23T03:59:35.563