python - What is the best drop-in replacement for numpy.interp if I want null interpolation (piecewise constant)?

Question

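numpy.interp is convenient and fairly fast. In some cases I would like to compare its output against a non-interpolating variant that propagates the sparse values into the "denser" output, so that the result is piecewise constant between the sparse inputs. The function I want could also be called a "sparse -> dense" converter: it copies the most recent sparse value until a later one is found (a kind of null interpolation, as if zero time/distance had elapsed since the earlier value).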

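Unfortunately, it is not easy to tweak the source of numpy.interp, because it is just a wrapper around a compiled function. I could write this myself with a Python loop, but I am hoping for a C-speed way to solve the problem.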

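Update: the solution below (scipy.interpolate.interp1d with kind='zero') is very slow, taking more than 10 seconds per call (e.g. for an input of length 500k that is 50% populated). kind='zero' is implemented with a zero-order spline, and its call to spleval is very slow. However, the source code for kind='linear' (the default interpolation) provides an excellent template for solving the problem with plain numpy (the minimal change is to set slope=0). That code shows how to use numpy.searchsorted to solve the problem, and its runtime is similar to that of numpy.interp. So the problem is solved by adapting scipy.interpolate.interp1d's linear-interpolation implementation to skip the interpolation step (a slope != 0 is what blends adjacent values).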
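To make the slope=0 idea concrete, here is a minimal sketch of a searchsorted-based linear interpolation with a switch that zeroes the slope. It is not the scipy source: the names interp_sketch and zero_slope are made up for illustration, the sample data is invented, and it assumes x is strictly increasing and that clamping at the edges (as numpy.interp does by default) is acceptable.

```python
import numpy as np

def interp_sketch(x_new, x, y, zero_slope=False):
    """Linear interpolation via searchsorted; zero_slope=True holds the left value.

    Hypothetical helper for illustration only; assumes x is strictly increasing.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Clamp queries to the sampled range, mimicking np.interp's default edge behaviour.
    xc = np.clip(np.asarray(x_new, dtype=float), x[0], x[-1])
    # lo: last sample at or before each query; hi: its right-hand neighbour.
    lo = np.searchsorted(x, xc, side='right') - 1
    hi = np.minimum(lo + 1, len(x) - 1)
    dx = x[hi] - x[lo]
    # Slope between the two neighbours; left at 0 where lo == hi (query at the last sample).
    slope = np.divide(y[hi] - y[lo], dx, out=np.zeros_like(dx), where=dx != 0)
    if zero_slope:
        slope = np.zeros_like(slope)  # skip the blending of adjacent values
    return y[lo] + slope * (xc - x[lo])

# Made-up sparse samples and denser query points.
xp = np.array([0.0, 1.0, 3.0])
fp = np.array([10.0, 20.0, 40.0])
q = np.array([-1.0, 0.0, 0.5, 1.0, 2.0, 3.0, 5.0])

linear = interp_sketch(q, xp, fp)                 # blends neighbouring values
held = interp_sketch(q, xp, fp, zero_slope=True)  # repeats the most recent value
```

With the slope forced to zero, the hi index and the division are no longer needed, which is why the piecewise-constant case reduces to a single searchsorted call plus an indexing step.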

score 4 · Accepted Answer

scipy.interpolate.interp1d can do various kinds of interpolation: 'linear', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic'.

Please see the documentation: http://docs.scipy.org/doc/scipy-0.10.1/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d
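As a rough illustration of how the kind argument is used, the sketch below builds a linear and a zero-order interpolant from the same made-up data. The arrays are invented for the example, and the query points are deliberately kept inside the sampled range, since interp1d raises an error for out-of-range inputs by default.

```python
import numpy as np
from scipy.interpolate import interp1d

# Made-up sparse samples (positions must be sorted).
xp = np.array([0.0, 1.0, 3.0])
fp = np.array([10.0, 20.0, 40.0])

# Query points strictly inside [xp[0], xp[-1]).
x = np.array([0.5, 1.0, 2.0])

# interp1d returns a callable; kind selects the interpolation scheme.
linear = interp1d(xp, fp, kind='linear')(x)  # blends neighbouring values
held = interp1d(xp, fp, kind='zero')(x)      # piecewise constant between samples
```

This is the most direct route, but as noted in the question's update, kind='zero' may be far slower than the linear path on large inputs.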

score 2 · Accepted Answer

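Just for completeness: the solution to the question is the following code, which I was able to write with the help of the hints given in the updated answer: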

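import numpy as np

def interpolate_constant(x, xp, yp):
    # Number of sample points with xp <= x; 0 means x lies before the first sample.
    indices = np.searchsorted(xp, x, side='right')
    # Prepend 0 so that points before xp[0] map to 0.
    y = np.concatenate(([0], yp))
    return y[indices]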

score 0 · Accepted Answer

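I completely agree that kind='zero' is very slow; for large datasets with millions of rows it can be 1000x slower than the 'linear' method. For "left-constant" interpolation, i.e. using the most recent value, the following code works: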

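import numpy as np

def approx(x, y, xout, yleft=np.nan, yright=np.nan):
    # Index of the last x <= xout (-1 before x[0]; np.where below masks that case).
    xoutIdx = np.searchsorted(x, xout, side='right') - 1
    return np.where(xout < x[0], yleft, np.where(xout > x[-1], yright, y[xoutIdx]))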

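Coming from an R background, this is equivalent to R's approx with f=0. I haven't found a clean way to do "right-constant" interpolation, because Python's np.searchsorted with side='right' pushes the index back by one if an xout value exactly matches a value in x...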
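One possible way to get the "right-constant" variant is to switch to side='left', which returns the first index whose x value is greater than or equal to the query, so exact matches map to themselves. The following is only a sketch under that assumption: approx_right is a hypothetical name, the data is made up, and x is assumed to be sorted in increasing order.

```python
import numpy as np

def approx_right(x, y, xout, yleft=np.nan, yright=np.nan):
    """Hypothetical 'next value' counterpart of the left-constant approx."""
    x = np.asarray(x)
    y = np.asarray(y)
    xout = np.asarray(xout)
    # First index i with x[i] >= xout; an exact match returns its own index.
    idx = np.searchsorted(x, xout, side='left')
    # Clamp so the lookup stays in bounds; out-of-range points are replaced below.
    vals = y[np.minimum(idx, len(x) - 1)]
    return np.where(xout < x[0], yleft, np.where(xout > x[-1], yright, vals))

# Made-up example data.
x = np.array([0.0, 1.0, 3.0])
y = np.array([10.0, 20.0, 40.0])
xout = np.array([-1.0, 0.0, 0.5, 1.0, 2.0, 3.0, 4.0])

result = approx_right(x, y, xout)
```

Here queries between two samples take the value of the later sample, while queries outside the sampled range fall back to yleft and yright, as in the left-constant version.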
