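Given a list of tuples, [(x1, y1), (x2, y2) ... (xm, ym)],
for example [(1, 2), (3, 7), (5, 9)],
I would like to write a function that fills in each missing integer value x with the average of the neighboring values f(x - 1) and f(x + 1).
In this case, we would get:
[(1, 2), (2, ave(2, 7)), (3, 7), (4, ave(7, 9)), (5, 9)]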
import numpy as np

# calculating nearest neighbor averages
def nearest(x, y):
    # define the min and max for our line (named lo/hi to avoid shadowing the builtins)
    lo = np.amin(x)
    hi = np.amax(x)
    # number of integer steps from lo to hi, inclusive
    numsteps = int(hi - lo + 1)
    # an empty vessel for the (x, y) results
    new_df = []
    # every integer x from lo to hi, as plain Python floats
    xs = np.linspace(lo, hi, numsteps).tolist()
    for i, item in enumerate(xs):
        if item in x:
            # known point: copy its y value
            idx = x.index(item)
            new_df.insert(i, (item, y[idx]))
        else:
            # missing point: average the y values at x - 1 and x + 1
            # (raises ValueError if either neighbor is itself missing)
            idx = x.index(item - 1)
            idx2 = x.index(item + 1)
            avg = (y[idx] + y[idx2]) / 2.0
            new_df.insert(i, (item, avg))
    print(new_df)

nearest([1, 3, 5], [6, 7, 8])
# [(1.0, 6), (2.0, 6.5), (3.0, 7), (4.0, 7.5), (5.0, 8)]
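However, this quickly fails for arrays such as xs = [1, 4, 7],
because the values are more than one apart. In that case, given the same ys = [2, 7, 9],
we would expect the answer to be:
[(1, 2), (2, ave(2, 7)), (3, ave(2, 7)), (4, 7) ... ]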
or
something a bit more complicated:
[(1, 2), (2, ave(prev, next_that_exists)), (3, ave(just_created, next_that_exists)), ...]
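To make the more complicated variant concrete, here is a rough sketch of how it might look. It assumes the x values are sorted, strictly increasing integers; the function name `fill_with_running_average` is made up for illustration. Each missing value is the average of the previous value (known or just created) and the next known value.

```python
def fill_with_running_average(xs, ys):
    """Fill gaps so each missing y is the mean of the previous y and the next known y.

    Assumes xs is sorted, strictly increasing, and contains integers.
    """
    filled = [(xs[0], ys[0])]
    for next_x, next_y in zip(xs[1:], ys[1:]):
        prev_x, prev_y = filled[-1]
        # walk through the missing integers between two known points
        for x in range(prev_x + 1, next_x):
            # prev_y is either a known value or the one created in the last step
            prev_y = (prev_y + next_y) / 2.0
            filled.append((x, prev_y))
        filled.append((next_x, next_y))
    return filled


print(fill_with_running_average([1, 4, 7], [2, 7, 9]))
# [(1, 2), (2, 4.5), (3, 5.75), (4, 7), (5, 8.0), (6, 8.5), (7, 9)]
```

With this rule the filled values drift toward the next known point instead of staying flat across the gap.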
How can I implement this so that we find the existing elements just below and just above the missing one and compute their average?
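One possible way to look up the elements just below and just above a missing x is a binary search over the known x values. The sketch below uses `bisect_left` from the standard library and assumes the xs are sorted, strictly increasing integers; the name `fill_with_neighbor_average` is hypothetical.

```python
from bisect import bisect_left


def fill_with_neighbor_average(xs, ys):
    """Fill missing integer x values with the mean of the nearest known y on each side.

    Assumes xs is sorted, strictly increasing, and contains integers.
    """
    filled = []
    for x in range(xs[0], xs[-1] + 1):
        pos = bisect_left(xs, x)  # index of the first known x that is >= x
        if xs[pos] == x:
            # known point: keep it unchanged
            filled.append((x, ys[pos]))
        else:
            # xs[pos - 1] < x < xs[pos]: average the two surrounding known values
            filled.append((x, (ys[pos - 1] + ys[pos]) / 2.0))
    return filled


print(fill_with_neighbor_average([1, 4, 7], [2, 7, 9]))
# [(1, 2), (2, 4.5), (3, 4.5), (4, 7), (5, 8.0), (6, 8.0), (7, 9)]
```

Every missing x inside the same gap gets the same value here, which matches the first expected output for xs = [1, 4, 7].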
Also, is this different from a moving average?
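For comparison, a simple moving average replaces each sample with the mean of a fixed-size window over the samples that already exist; it does not create points for missing x values. A small illustration with `np.convolve`, where the window size of 2 is just an assumption for the example:

```python
import numpy as np

ys = np.array([2.0, 7.0, 9.0])
window = 2  # assumed window size, chosen only for illustration

# each output is the mean of `window` consecutive existing samples
moving_avg = np.convolve(ys, np.ones(window) / window, mode="valid")

print(moving_avg.tolist())
# [4.5, 8.0]
```

The result has fewer points than the input and says nothing about x = 2, 3, 5 or 6, so it smooths the known values rather than filling the gaps between them.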