22

我想计算列表中日期之间的平均时间增量。尽管以下方法效果很好,但我想知道是否有更聪明的方法?

delta = lambda last, next: (next - last).seconds + (next - last).days * 86400   
total = sum(delta(items[i-1], items[i]) for i in range(1, len(items)))
average = total / (len(items) - 1)
4

4 回答 4

65

顺便说一句,如果你有一个 timedeltas 或 datetimes 列表,你为什么还要自己做数学?

datetimes = [ ... ]

# subtracting datetimes gives timedeltas
timedeltas = [datetimes[i-1]-datetimes[i] for i in range(1, len(datetimes))]

# giving datetime.timedelta(0) as the start value makes sum work on tds 
average_timedelta = sum(timedeltas, datetime.timedelta(0)) / len(timedeltas)
于 2010-09-01T11:32:38.687 回答
3

试试这个:

from itertools import izip

def average(items):   
    total = sum((next - last).seconds + (next - last).days * 86400
                for next, last in izip(items[1:], items))
     return total / (len(items) - 1)

在我看来,这样做更具可读性。对您的代码的数学倾向较低的读者的评论可能有助于解释您如何计算每个增量。对于它的价值,一个生成器表达式的操作码指令是我看过的任何东西中最少的(我认为是最慢的)。

  # The way in your question compiles to....
  3           0 LOAD_CONST               1 (<code object <lambda> at 0xb7760ec0, file 

"scratch.py", line 3>)
              3 MAKE_FUNCTION            0
              6 STORE_DEREF              1 (delta)

  4           9 LOAD_GLOBAL              0 (sum)
             12 LOAD_CLOSURE             0 (items)
             15 LOAD_CLOSURE             1 (delta)
             18 BUILD_TUPLE              2
             21 LOAD_CONST               2 (<code object <genexpr> at 0xb77c0a40, file "scratch.py", line 4>)
             24 MAKE_CLOSURE             0
             27 LOAD_GLOBAL              1 (range)
             30 LOAD_CONST               3 (1)
             33 LOAD_GLOBAL              2 (len)
             36 LOAD_DEREF               0 (items)
             39 CALL_FUNCTION            1
             42 CALL_FUNCTION            2
             45 GET_ITER            
             46 CALL_FUNCTION            1
             49 CALL_FUNCTION            1
             52 STORE_FAST               1 (total)

  5          55 LOAD_FAST                1 (total)
             58 LOAD_GLOBAL              2 (len)
             61 LOAD_DEREF               0 (items)
             64 CALL_FUNCTION            1
             67 LOAD_CONST               3 (1)
             70 BINARY_SUBTRACT     
             71 BINARY_DIVIDE       
             72 STORE_FAST               2 (average)
             75 LOAD_CONST               0 (None)
             78 RETURN_VALUE        
None
#
#doing it with just one generator expression and itertools...

  4           0 LOAD_GLOBAL              0 (sum)
              3 LOAD_CONST               1 (<code object <genexpr> at 0xb777eec0, file "scratch.py", line 4>)
              6 MAKE_FUNCTION            0

  5           9 LOAD_GLOBAL              1 (izip)
             12 LOAD_FAST                0 (items)
             15 LOAD_CONST               2 (1)
             18 SLICE+1             
             19 LOAD_FAST                0 (items)
             22 CALL_FUNCTION            2
             25 GET_ITER            
             26 CALL_FUNCTION            1
             29 CALL_FUNCTION            1
             32 STORE_FAST               1 (total)

  6          35 LOAD_FAST                1 (total)
             38 LOAD_GLOBAL              2 (len)
             41 LOAD_FAST                0 (items)
             44 CALL_FUNCTION            1
             47 LOAD_CONST               2 (1)
             50 BINARY_SUBTRACT     
             51 BINARY_DIVIDE       
             52 RETURN_VALUE        
None

特别是,删除 lambda 可以让我们避免创建一个闭包、构建一个元组和加载两个闭包。无论哪种方式,都会调用五个函数。当然,这种对性能的关注有点荒谬,但很高兴知道引擎盖下发生了什么。最重要的是可读性,我认为这样做也能获得很高的分数。

于 2010-09-01T10:57:27.003 回答
0
sum(timedelta_list ,datetime.timedelta())
于 2020-12-11T01:35:29.713 回答
0

如果您有时间增量列表:

import pandas as pd
avg=pd.to_timedelta(pd.Series(yourtimedeltalist)).mean()
于 2021-12-03T10:14:25.340 回答