3

我有一个这个数据,我不会在给定的两个日期期间按周聚合它:我确实有这个元组列表,它从我的交易表中显示每天的销售额

Metadate = [(Day, 'totalSales By dollars')]

--

data=[('2013-06-21', 14),
 ('2013-06-20', 19),
 ('2013-06-23', 11),
 ('2013-06-22', 13),
 ('2013-06-25', 6),
 ('2013-06-24', 22),
 ('2013-06-27', 22),
 ('2013-06-26', 3),
 ('2013-06-29', 10),
 ('2013-06-28', 15),
 ('2013-04-05', 20),
 ('2013-04-04', 33),
 ('2013-04-07', 20),
 ('2013-04-06', 15),
 ('2013-04-01', 23),
 ('2013-04-03', 5),
 ('2013-04-02', 8),
 ('2013-04-09', 10),
 ('2013-04-08', 24),
 ('2013-05-08', 1),
 ('2013-05-09', 29),
 ('2013-05-02', 17),
 ('2013-05-03', 18),
 ('2013-05-01', 4),
 ('2013-05-06', 16),
 ('2013-05-07', 10),
 ('2013-05-04', 9),
 ('2013-05-05', 12),
 ('2013-05-19', 21),
 ('2013-05-18', 26),
 ('2013-05-11', 8),
 ('2013-05-10', 12),
 ('2013-05-13', 24),
 ('2013-05-12', 9),
 ('2013-05-15', 5),
 ('2013-05-14', 7),
 ('2013-05-17', 20),
 ('2013-05-16', 36),
 ('2013-05-20', 24),
 ('2013-05-21', 5),
 ('2013-05-22', 3),
 ('2013-05-23', 18),
 ('2013-05-24', 8),
 ('2013-05-25', 11),
 ('2013-05-26', 9),
 ('2013-05-27', 13),
 ('2013-05-28', 4),
 ('2013-05-29', 7),
 ('2013-06-18', 9),
 ('2013-06-19', 2),
 ('2013-06-10', 20),
 ('2013-06-11', 4),
 ('2013-06-12', 3),
 ('2013-06-13', 25),
 ('2013-06-14', 16),
 ('2013-06-15', 10),
 ('2013-06-16', 11),
 ('2013-06-17', 17),
 ('2013-04-30', 12),
 ('2013-05-31', 13),
 ('2013-05-30', 29),
 ('2013-06-09', 12),
 ('2013-06-08', 20),
 ('2013-06-07', 47),
 ('2013-06-06', 5),
 ('2013-06-05', 2),
 ('2013-06-04', 3),
 ('2013-06-03', 32),
 ('2013-06-02', 13),
 ('2013-06-01', 9),
 ('2013-04-23', 3),
 ('2013-04-22', 33),
 ('2013-04-21', 14),
 ('2013-04-20', 20),
 ('2013-04-27', 15),
 ('2013-04-26', 17),
 ('2013-04-25', 21),
 ('2013-04-24', 1),
 ('2013-04-29', 34),
 ('2013-04-28', 11),
 ('2013-06-30', 13),
 ('2013-04-16', 5),
 ('2013-04-17', 3),
 ('2013-04-14', 10),
 ('2013-04-15', 22),
 ('2013-04-12', 23),
 ('2013-04-13', 19),
 ('2013-04-10', 1),
 ('2013-04-11', 31),
 ('2013-04-18', 27),
 ('2013-04-19', 14)]

我想在给定开始和结束日期的两个日期按周聚合它:
输出示例:

[(2013-05-07, 900), [(2013-05-14, 1800),....., (2013-08-01, 1000)]

提前致谢!

4

2 回答 2

4

尝试这个:

from datetime import datetime
import itertools

sales = [
    ('2013-05-01', 100),
    ('2013-05-02', 200),
    ('2013-05-03', 150),
    ('2013-05-03', 120),
    ('2013-05-04', 200),
    ('2013-08-01', 250),
]

def toWeek(sale):
    '''(date,volume) -> date of the Sunday of that week'''
    sunday = datetime.strptime(sale[0], '%Y-%m-%d').strftime('%Y-%U-0')
    return datetime.strptime(sunday, '%Y-%U-%w').strftime('%Y-%m-%d')

grouped_sales = itertools.groupby(sales, toWeek)

aggregate_sales = (
    (week, sum(day_sales for date, day_sales in week_sales))
    for week, week_sales in grouped_sales)

print list(aggregate_sales)
于 2013-09-14T04:06:44.890 回答
1

您还可以使用ISO 8601 周计算双端队列来执行此操作:

import datetime
from collections import deque

data=[
    ('2013-05-01', 100),
    ('2013-05-02', 200),
    ('2013-05-03', 150),
    ('2013-05-03', 120),
    ('2013-05-04', 200),
    ('2013-08-01', 250),
]
fmt='%Y-%m-%d'
data=deque((datetime.datetime.strptime(x,fmt),y) 
              for x,y in sorted(data,key=lambda t:t[0]))
grouped={}
while data:
    week=data[0][0]-datetime.timedelta(days=(data[0][0].isocalendar()[2]-1))
    temp=[data.popleft()]
    while data and week.isocalendar()[:2]==data[0][0].isocalendar()[:2]:
        temp.append(data.popleft())
    
    key='ISO week {}-W{:02}'.format(*week.isocalendar()[:2]) 
    key+=' ({} - {})'.format(week.strftime(fmt),
                        (week+datetime.timedelta(days=6)).strftime(fmt))     

    grouped[key]=sum(t[1] for t in temp)
    
for k in sorted(grouped):
    print '{}: {}'.format(k, grouped[k])

印刷:

ISO week 2013-W18 (2013-04-29 - 2013-05-05): 770
ISO week 2013-W31 (2013-07-29 - 2013-08-04): 250

编辑

我看到了您发布的代码,您可以使用另一个双端队列进行简化:

D=deque()
fmt='%Y-%m-%d'
data=deque((datetime.datetime.strptime(x,fmt),y) 
              for x,y in sorted(data,key=lambda t:t[0]))
while data:
    week=data[0][0]-datetime.timedelta(days=(data[0][0].isocalendar()[2]-1))
    temp=[data.popleft()]
    while data and week.isocalendar()[:2]==data[0][0].isocalendar()[:2]:
        temp.append(data.popleft())
    
    D.appendleft((week.strftime(fmt), sum(t[1] for t in temp)))

print list(D)

印刷

[('2013-06-24', 91), ('2013-06-17', 85), ('2013-06-10', 89), ('2013-06-03', 121), ('2013-05-27', 88), ('2013-05-20', 78), ('2013-05-13', 139), ('2013-05-06', 85), ('2013-04-29', 106), ('2013-04-22', 101), ('2013-04-15', 105), ('2013-04-08', 118), ('2013-04-01', 124)]
于 2013-09-14T17:02:46.887 回答