This is my input:

ClientData = {
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },


'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1),
      ],
           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

How can I append the union of 'aggregate_PageviewsByWeek' and 'aggregate_VisitsByWeek', matched on the date key, to the key 'aggregate_Pageviews_VisitsByWeek'?

The output would look something like this:

{
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [

                                               ('2013-01-06', 2, 0),
                                               ('2013-02-03', 1, 0),
                                               ('2013-02-10', 1, 0),
                                               ('2013-02-24', 1, 0),
                                               ('2013-03-03', 2, 1),
                                               ('2013-05-12', 0, 1)],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },



'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1)],

           'aggregate_Pageviews_VisitsByWeek': [
                                       ('2013-01-06', 2, 0),
                                       ('2013-02-03', 1, 0),
                                       ('2013-02-10', 1, 0),
                                       ('2013-02-24', 1, 0),
                                       ('2013-03-03', 2, 1),
                                       ('2013-03-24', 1, 0),
                                       ('2013-03-31', 0, 1),
                                       ('2013-05-12', 0, 1),
                                       ('2013-05-19', 0, 2),
                                       ('2013-06-30', 0, 2)],

           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

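If a date key is not present in the other list, I want to substitute 0 for that value. Each tuple has the form (Date, aggregate_PageviewsByWeek_Value, aggregate_VisitsByWeek_Value).

Example:
aggregate_PageviewsByWeek: ('2013-01-06', 12) and aggregate_VisitsByWeek: ('2013-01-13', 30)

The output would be:
aggregate_Pageviews_VisitsByWeek: [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]

My goal with this question is to get the trend of pageviews and visits by date.

Thanks!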

2 Answers

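First, you need a function that merges the entries of a single client.

There are two simple ways to merge parallel sequences that may each be missing some values: you can iterate over both in parallel, or you can build a dict (or sorted map) of the keys and process each sequence separately. You can see an example of the first approach, for instance, here. But the second is simpler, at least in Python, as long as the keys are hashable. So: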

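def merge_client(client):
    # Map each date to [pageviews, visits]; a missing count defaults to 0.
    merged = {}
    for day, views in client['aggregate_PageviewsByWeek']:
        merged[day] = [views, 0]
    for day, visits in client['aggregate_VisitsByWeek']:
        # Reuse the entry if the date already has pageviews, else start at [0, 0].
        merged.setdefault(day, [0, 0])[1] = visits
    # Flatten to (date, pageviews, visits) tuples and sort them by date.
    flattened = [tuple([key] + value) for key, value in merged.items()]
    client['aggregate_Pageviews_VisitsByWeek'] = sorted(flattened)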

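To extend this algorithm to more than two series, you can use append; or, if there could be a large number of values per date, just use a dict instead of a list for each date (so that the default 0s do not have to be filled in until the end).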
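The dict-based extension could be sketched roughly as follows. This is only an illustration under assumptions: the function name `merge_series` and its `keys` and `target` parameters are hypothetical and do not appear in the question, and the sample client is the small example from the question.

```python
def merge_series(client, keys, target):
    """Merge any number of (date, value) lists stored under `keys` into `target`.

    `merge_series`, `keys` and `target` are hypothetical names for this sketch.
    """
    merged = {}  # date -> {series index: value}
    for index, key in enumerate(keys):
        for day, value in client[key]:
            merged.setdefault(day, {})[index] = value
    # The default 0s are filled in only here, at the very end.
    client[target] = [
        (day,) + tuple(merged[day].get(index, 0) for index in range(len(keys)))
        for day in sorted(merged)
    ]


client = {
    'aggregate_PageviewsByWeek': [('2013-01-06', 12)],
    'aggregate_VisitsByWeek': [('2013-01-13', 30)],
    'aggregate_Pageviews_VisitsByWeek': [],
}
merge_series(
    client,
    ['aggregate_PageviewsByWeek', 'aggregate_VisitsByWeek'],
    'aggregate_Pageviews_VisitsByWeek',
)
print(client['aggregate_Pageviews_VisitsByWeek'])
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]
```

Because each date maps to a small dict keyed by series index, adding a third or fourth series only means passing one more key.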

Now we just need to call merge_client on each client in ClientData:

for client in ClientData.values():
    merge_client(client)
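For comparison, the other approach mentioned earlier (iterating over both sequences in parallel) might look like the sketch below. It assumes both lists are already sorted by date and that the dates are ISO-formatted strings, so plain string comparison orders them correctly; `merge_sorted` is a hypothetical name, not something from the question.

```python
def merge_sorted(views, visits):
    """Merge two date-sorted (date, value) lists by walking both in parallel."""
    merged = []
    i = j = 0
    while i < len(views) and j < len(visits):
        view_day, view_count = views[i]
        visit_day, visit_count = visits[j]
        if view_day == visit_day:
            # Date present in both lists: keep both counts.
            merged.append((view_day, view_count, visit_count))
            i += 1
            j += 1
        elif view_day < visit_day:
            # Date only in the pageviews list: visits default to 0.
            merged.append((view_day, view_count, 0))
            i += 1
        else:
            # Date only in the visits list: pageviews default to 0.
            merged.append((visit_day, 0, visit_count))
            j += 1
    # One list is exhausted; whatever remains in the other has no counterpart.
    merged.extend((day, count, 0) for day, count in views[i:])
    merged.extend((day, 0, count) for day, count in visits[j:])
    return merged


print(merge_sorted([('2013-01-06', 12), ('2013-03-03', 2)],
                   [('2013-01-13', 30), ('2013-03-03', 1)]))
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30), ('2013-03-03', 2, 1)]
```

This avoids building a dict, but it needs more bookkeeping and only works on sorted input, which is why the dict version is usually the simpler choice.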
answered 2013-09-18T19:48:28.830

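Convert each list to a dict, combine the keys of those dicts, then loop over the keys and build another list in which each element holds the date, the value from the first dict or 0, and the value from the second dict or 0. It is better explained through code :)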

def merge_lists(list1, list2):
    dict1 = dict(list1)
    dict2 = dict(list2)
    dates = list(set(dict1.keys())|set(dict2.keys()))
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        item.append(dict1.get(date,0))
        item.append(dict2.get(date,0))
        merged_list.append(item)

    return merged_list

merged_list = merge_lists([('2013-01-06', 2),
            ('2013-02-03', 1),
            ('2013-02-10', 1),
            ('2013-02-24', 1),
            ('2013-03-03', 2),
            ('2013-03-24', 1)],
            [('2013-03-03', 1),
            ('2013-03-31', 1),
            ('2013-05-12', 1),
            ('2013-05-19', 2),
            ('2013-06-30', 2)])


import pprint
pprint.pprint(merged_list)

Output:

[['2013-01-06', 2, 0],
 ['2013-02-03', 1, 0],
 ['2013-02-10', 1, 0],
 ['2013-02-24', 1, 0],
 ['2013-03-03', 2, 1],
 ['2013-03-24', 1, 0],
 ['2013-03-31', 0, 1],
 ['2013-05-12', 0, 1],
 ['2013-05-19', 0, 2],
 ['2013-06-30', 0, 2]]
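To store the result back under 'aggregate_Pageviews_VisitsByWeek' as tuples, as the question asks, something like the following might work. It is a sketch under assumptions: `merge_lists_as_tuples` is a hypothetical compact variant of the function above, and the ClientData shown here is a shortened sample rather than the full input.

```python
def merge_lists_as_tuples(list1, list2):
    # Same idea as merge_lists, written compactly and returning tuples.
    dict1, dict2 = dict(list1), dict(list2)
    return [(date, dict1.get(date, 0), dict2.get(date, 0))
            for date in sorted(set(dict1) | set(dict2))]


# Shortened sample of the question's structure, for illustration only.
ClientData = {
    'ClientName1': {
        'aggregate_PageviewsByWeek': [('2013-02-24', 1), ('2013-03-03', 2)],
        'aggregate_Pageviews_VisitsByWeek': [],
        'aggregate_VisitsByWeek': [('2013-03-03', 1), ('2013-05-12', 1)],
    },
}

for client in ClientData.values():
    client['aggregate_Pageviews_VisitsByWeek'] = merge_lists_as_tuples(
        client['aggregate_PageviewsByWeek'],
        client['aggregate_VisitsByWeek'])

print(ClientData['ClientName1']['aggregate_Pageviews_VisitsByWeek'])
# [('2013-02-24', 1, 0), ('2013-03-03', 2, 1), ('2013-05-12', 0, 1)]
```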

You can make it generic so that it merges any number of lists:

def merge_lists(*lists):
    dicts = [dict(l) for l in lists]
    dates = set()
    for d in dicts:
        dates |= set(d.keys())
    dates = list(dates)
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        for d in dicts:
            item.append(d.get(date,0))
        merged_list.append(item)

    return merged_list
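As a quick illustration of the generic form with three lists, here is a condensed sketch. The name `merge_many` and the third series (`signups`) are made up for the example and are not part of the question's data.

```python
def merge_many(*lists):
    # One dict per input list, then one output row per date seen in any of them.
    dicts = [dict(pairs) for pairs in lists]
    dates = sorted(set().union(*dicts))
    return [[date] + [d.get(date, 0) for d in dicts] for date in dates]


pageviews = [('2013-01-06', 2), ('2013-03-03', 2)]
visits = [('2013-03-03', 1), ('2013-05-12', 1)]
signups = [('2013-01-06', 5)]  # made-up third series, for illustration only

print(merge_many(pageviews, visits, signups))
# [['2013-01-06', 2, 0, 5], ['2013-03-03', 2, 1, 0], ['2013-05-12', 0, 1, 0]]
```

Each row has one column per input list, in the order the lists were passed.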
answered 2013-09-18T19:39:08.307