
Consider the following code:

 sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789, 1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]
 avg = []; final = []

 def runningMean(seq, n=0, total=0):  # function called recursively
     if not seq:
         return []
     total = total + int(seq[-1])
     return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

 def main():

     avg = runningMean(sub, n=0, total=0)  # running mean starting from the last element in the list, i.e. 1710375
     print avg
     for i in range(len(sub)):
         if int(sub[i]) > float(avg[i] * 0.9):  # checking the condition
             final.append(sub[i])
     print final


 if __name__ == '__main__':
     main()

The output consists of the running mean list, followed by the sublist of elements that satisfy the condition:

  [1282960.6216216215, 1297286.75, 1312372.4571428571, 1328319.6764705882, 1345230.0909090908, 1363181.3125, 1382289.2580645161, 1402634.7, 1409742.7931034483, 1417241.142857143, 1425232.111111111, 1433651.3846153845, 1442738.76, 1452397.5, 1462798.0869565217, 1474143.2727272727, 1486568.142857143, 1492803.2, 1499691.7368421052, 1507344.111111111, 1515724.0, 1525005.25, 1535471.9333333333, 1547401.642857143, 1561126.2307692308, 1577136.75, 1595934.1818181819, 1618484.2, 1646032.3333333333, 1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0]

  [1361867, 1361921, 1361949, 1364886, 1367224, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

What I need is for it to stop computing the running mean as soon as this condition fails:

(sub[i] > float(avg[i] * 0.9))

i.e. the result should be:

  [1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0]
  [1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

It would be very helpful if anyone could suggest a solution for this in Python.


3 Answers

1

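I suggest reimplementing your average calculator as a generator. A generator computes each value only when the iteration asks for it, so if you stop iterating early, the remaining calculations are never done.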

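Also, it is much easier to design code that iterates forwards than backwards. If you need to go backwards, use the reversed function to get a reverse iterator, or call the reverse method on the list.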
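
As a small illustration of the difference between the two (a sketch only; `data` and `copy` are made-up names, not part of the question's code): `reversed` returns a lazy iterator and leaves the list untouched, while `list.reverse` reorders the list in place and returns `None`.

```python
data = [1, 2, 3, 4]

# reversed() returns an iterator; the original list is not modified.
backwards = reversed(data)
first_from_end = next(backwards)  # the last element of data

# list.reverse() mutates the list in place and returns None.
copy = list(data)
result = copy.reverse()
```

Because `reversed` does not copy the list, it is probably the better fit when you only want to walk the data from the end and may stop early.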

Here is a generator that computes the cumulative average (forwards, not backwards):

def runningMean(iterable):
    """A generator, yielding a cumulative average of its input."""
    num = 0
    denom = 0
    for x in iterable:
        num += x
        denom += 1
        yield num / float(denom)  # float() avoids integer division on Python 2
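
To see that a generator like this really does no extra work, here is a minimal sketch. The `consumed` list and the name `running_mean` are additions for illustration only; they record which inputs were actually pulled when only the first two averages are requested.

```python
import itertools

consumed = []  # hypothetical tracer: inputs the generator has pulled so far

def running_mean(iterable):
    """Yield the cumulative average of the values seen so far."""
    total = 0
    count = 0
    for x in iterable:
        consumed.append(x)
        total += x
        count += 1
        yield total / float(count)

# Ask for only the first two averages of a five-element input.
first_two = list(itertools.islice(running_mean([10, 20, 30, 40, 50]), 2))
print(first_two)  # [10.0, 15.0]
print(consumed)   # [10, 20]
```

The last three inputs are never touched, which is the property the early-stopping approach relies on.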

To get the reverse cumulative average you want, use it on a reversed iterator over the original data:

>>> sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,      1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]
>>> list(runningMean(reversed(sub)))
[1710375.0, 1710370.0, 1710363.3333333333, 1710353.0, 1710344.0, 1710330.6666666667, 1710198.857142857, 1680349.875, 1646032.3333333333, 1618484.2, 1595934.1818181819, 1577136.75, 1561126.2307692308, 1547401.642857143, 1535471.9333333333, 1525005.25, 1515724.0, 1507344.111111111, 1499691.7368421052, 1492803.2, 1486568.142857143, 1474143.2727272727, 1462798.0869565217, 1452397.5, 1442738.76, 1433651.3846153845, 1425232.111111111, 1417241.142857143, 1409742.7931034483, 1402634.7, 1382289.2580645161, 1363181.3125, 1345230.0909090908, 1328319.6764705882, 1312372.4571428571, 1297286.75, 1282960.6216216215]

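If you want to see it in the same order as the original input, you can reverse it with the list.reverse() method, but if you want to stop the calculation early, I think you need to leave it backwards for a while.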

To stop at the first value that is 10% or more below the cumulative average, you can use itertools.takewhile:

import itertools

results = list(itertools.takewhile(lambda x: x[0] > 0.9 * x[1],
                                   itertools.izip(reversed(sub),
                                                  runningMean(reversed(sub)))))

In Python 3, use the regular built-in zip function instead of itertools.izip.
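
For reference, a Python 3 sketch of the same idea might look like the following. To keep it short it uses only the last ten values of the question's data (named `sub_tail` here, an assumption for illustration) and a snake_case copy of the generator.

```python
import itertools

def running_mean(iterable):
    """Yield the cumulative average of the values seen so far."""
    total = 0
    count = 0
    for x in iterable:
        total += x
        count += 1
        yield total / count  # "/" is true division in Python 3

# Last ten values of the question's list (shortened for illustration).
sub_tail = [1370551, 1371492, 1471407, 1709408, 1710264,
            1710308, 1710322, 1710350, 1710365, 1710375]

# Pair each value (walking from the end) with the running mean up to it,
# and keep pairs only while the value exceeds 90% of that mean.
pairs = zip(reversed(sub_tail), running_mean(reversed(sub_tail)))
results = list(itertools.takewhile(lambda pair: pair[0] > 0.9 * pair[1], pairs))
```

Note that takewhile has to read the first failing pair in order to test it, and then discards it, so that pair's average is computed but does not appear in `results`.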

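This gives you a list of (value, average) pairs that satisfy your condition, starting from the end and stopping just before the first value that fails the test. You can view them with: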

results.reverse() # put them back in regular order
for value, average in results:
    print value, average

Output:

1709408 1710198.857142857
1710264 1710330.6666666667
1710308 1710344.0
1710322 1710353.0
1710350 1710363.3333333333
1710365 1710370.0
1710375 1710375.0
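
If you would rather have two separate lists, as in the question's expected output, the pairs can be split afterwards. This is only a sketch: the `results` literal below copies the last three pairs of the output above so that the snippet runs on its own, and `values` and `averages` are made-up names.

```python
# Stand-in for the already reversed list of (value, average) pairs.
results = [
    (1710350, 1710363.3333333333),
    (1710365, 1710370.0),
    (1710375, 1710375.0),
]

# Split the pairs into two parallel lists; this also works when results is empty.
values = [value for value, _ in results]
averages = [average for _, average in results]
```

Note that the question's expected average list also contains 1680349.875, the average at the first failing value; takewhile drops that pair, so it would have to be added separately if it is really wanted.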
answered 2013-08-21T05:40:26.247
0
sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,      1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375]

def runningMean(seq, n=0, total=0): #function called recursively
    if not seq:
        return []
    total = total + int(seq[-1])
    if int(seq[-1]) < total/float(n+1) * 0.9:  # Check your condition to see if it's time to stop averaging.
        return []
    return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

avg = runningMean(sub, n = 0, total = 0)

print avg
print sub[-len(avg):]
answered 2013-08-21T04:38:34.560
0

To get the expected running mean, I ran:

sub.reverse()
avg = runningMean(sub,n = 0,total = 0) #function call to obtain running mean starting from last element in the list i,e 1710375
print avg

The comparison part that follows is not clear to me. Could you describe the algorithm in words?

answered 2013-08-21T04:58:44.307