python - Unpacking an arbitrary number of tuples with common dates

Question

Input

datas2 = [[("01/01/2011", 1), ("02/02/2011", "No"), ("03/03/2011", 11)],
[("01/01/2011", 2), ("03/03/2011", 22), ("22/22/2222", "no")],
[("01/01/2011", 3), ("03/03/2011", 33), ("22/22/2222", "333")]]

Expected output

[("01/01/2011", 1, 2, 3), ("03/03/2011", 11, 22, 33)]

[Update]

I was asked for real data and more examples (messy legacy code):

A                       B                       C
09.05.2011;1.561        12.04.2011;14.59        12.04.2011;1.5
10.05.2011;1.572        13.04.2011;14.50        13.04.2011;1.5    
11.05.2011;1.603        14.04.2011;14.56        14.04.2011;1.5    
12.05.2011;1.566        15.04.2011;14.54        15.04.2011;1.5    
13.05.2011;1.563        18.04.2011;14.54        18.04.2011;1.5    
16.05.2011;1.537        19.04.2011;14.52        19.04.2011;1.5    
17.05.2011;1.528        20.04.2011;14.53        20.04.2011;1.5    
18.05.2011;1.543        21.04.2011;14.59        21.04.2011;1.5    
19.05.2011;1.537        26.04.2011;14.65        26.04.2011;1.6    
20.05.2011;1.502        27.04.2011;14.68        27.04.2011;1.6    
23.05.2011;1.503        28.04.2011;14.66        28.04.2011;1.6    
24.05.2011;1.483        29.04.2011;14.62        29.04.2011;1.6    
25.05.2011;1.457        02.05.2011;14.65        02.05.2011;1.6    
26.05.2011;1.491        03.05.2011;14.63        03.05.2011;1.6    
27.05.2011;1.509        04.05.2011;14.54        04.05.2011;1.5    
30.05.2011;1.496        05.05.2011;14.57        05.05.2011;1.5    
31.05.2011;1.503        06.05.2011;14.57        06.05.2011;1.5    
01.06.2011;1.509        09.05.2011;14.61        09.05.2011;1.6    
03.06.2011;1.412        10.05.2011;14.66        10.05.2011;1.6    
06.06.2011;1.380        11.05.2011;14.71        11.05.2011;1.7    
07.06.2011;1.379        12.05.2011;14.71        12.05.2011;1.7    
08.06.2011;1.372        13.05.2011;14.70        13.05.2011;1.7    
09.06.2011;1.366        16.05.2011;14.75        16.05.2011;1.7    
10.06.2011;1.405        17.05.2011;14.69        17.05.2011;1.6    
13.06.2011;1.400        18.05.2011;14.65        18.05.2011;1.6    
14.06.2011;1.414        19.05.2011;14.69        19.05.2011;1.6

If I unpack A and B, the result will contain all values.
If I unpack A, B and C, it will contain:

[ ["09.05.2011", 1.561, 14.61, 1.6], ["10.05.2011", 1.572, 14.66, 1.6], ["11.05.2011", 1.603, 14.71, 1.7], ["12.05.2011", 1.566, 14.71, 1.7], ["13.05.2011", 1.563, 14.70, 1.7], ["16.05.2011", 1.537, 14.75, 1.7], ["17.05.2011", 1.528, 14.69, 1.6], ["18.05.2011", 1.543, 14.65, 1.6], ["19.05.2011", 1.537, 14.69, 1.6] ]

So every date must have as many values as there are files, i.e. one for each of the columns A, B, C, ...

score 3 · Accepted Answer

from collections import defaultdict
import itertools

d = defaultdict(list)
for i,j in itertools.chain.from_iterable(datas2):
    if not isinstance(j, str):
        d[i].append(j)

And d will be a dictionary like this:

{'01/01/2011': [1, 2, 3], '03/03/2011': [11, 22, 33]}

So you can later turn d.items() into the tuples you want.
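One way to do that formatting, as a minimal sketch: the name `rows` is mine and not part of the answer, and the output order relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict
import itertools

datas2 = [
    [("01/01/2011", 1), ("02/02/2011", "No"), ("03/03/2011", 11)],
    [("01/01/2011", 2), ("03/03/2011", 22), ("22/22/2222", "no")],
    [("01/01/2011", 3), ("03/03/2011", 33), ("22/22/2222", "333")],
]

# Same grouping loop as above: skip string values, collect the rest per date.
d = defaultdict(list)
for date, value in itertools.chain.from_iterable(datas2):
    if not isinstance(value, str):
        d[date].append(value)

# Prepend each date to its collected values to get one flat tuple per date.
rows = [(date,) + tuple(values) for date, values in d.items()]
print(rows)
# [('01/01/2011', 1, 2, 3), ('03/03/2011', 11, 22, 33)]
```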

Note that "22/22/2222" is not validated as a date, but that is easy to do inside the for loop.
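A possible way to add that check, sketched with `datetime.strptime`. The helper `is_valid_date` and the `"%d/%m/%Y"` format are my assumptions based on the sample data, not something the answer specifies.

```python
from collections import defaultdict
from datetime import datetime
import itertools


def is_valid_date(text, fmt="%d/%m/%Y"):
    """Return True if text parses as a real calendar date in the given format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        # Raised for impossible dates such as month 22.
        return False


datas2 = [
    [("01/01/2011", 1), ("02/02/2011", "No"), ("03/03/2011", 11)],
    [("01/01/2011", 2), ("03/03/2011", 22), ("22/22/2222", "no")],
    [("01/01/2011", 3), ("03/03/2011", 33), ("22/22/2222", "333")],
]

d = defaultdict(list)
for date, value in itertools.chain.from_iterable(datas2):
    # Keep a value only if it is not a string and its date is valid.
    if not isinstance(value, str) and is_valid_date(date):
        d[date].append(value)

print(is_valid_date("22/22/2222"))  # False
print(dict(d))  # {'01/01/2011': [1, 2, 3], '03/03/2011': [11, 22, 33]}
```

With the sample input the result is unchanged, because the entries for "22/22/2222" already carry string values. The extra check only matters if an invalid date arrives with a numeric value.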

score 2 · Accepted Answer

This code is written to work equally well on Python 2.x or Python 3.x. I tested it with Python 2.7 and Python 3.2.

from collections import defaultdict

datas2 = [
    [("01/01/2011", 1), ("02/02/2011", "No"), ("03/03/2011", 11)],
    [("01/01/2011", 2), ("03/03/2011", 22), ("22/22/2222", "no")],
    [("01/01/2011", 3), ("03/03/2011", 33), ("22/22/2222", "333")]
]


def want_value(val):
    """return true if val is a value we want to keep"""
    try:
        # detect numbers by trying to add to 0
        0 + val
        # no exception means it is a number and we want it
        return True
    except TypeError:
        # exception means it is the wrong type (a string or whatever)
        return False

result = defaultdict(list)

for lst in datas2:
    for date, val in lst:
        if want_value(val):
            result[date].append(val)

final_result = list(result.items())
print(final_result)
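The loop above groups values by date but does not check that every list contributed a value, which the update to the question asks for. A rough sketch of that extra step, using a few rows copied from columns A, B and C of the question: the names `columns`, `values_by_date` and `complete` are mine, and the sketch assumes each date appears at most once per column.

```python
from collections import defaultdict

# A few "date;value" rows copied from columns A, B and C of the question.
columns = [
    ["09.05.2011;1.561", "10.05.2011;1.572"],                      # A
    ["12.04.2011;14.59", "09.05.2011;14.61", "10.05.2011;14.66"],  # B
    ["12.04.2011;1.5", "09.05.2011;1.6", "10.05.2011;1.6"],        # C
]

values_by_date = defaultdict(list)
for column in columns:
    for entry in column:
        date, raw = entry.split(";")
        values_by_date[date].append(float(raw))

# Keep only the dates that received one value from every column.
complete = [
    [date] + values
    for date, values in values_by_date.items()
    if len(values) == len(columns)
]
print(complete)
# [['09.05.2011', 1.561, 14.61, 1.6], ['10.05.2011', 1.572, 14.66, 1.6]]
```

Here "12.04.2011" is dropped because column A has no value for it. Comparing the length against `len(columns)` is only reliable while a date cannot repeat within a column.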
