python-3.x - What is another way to write this Python 3 zip?

Question

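I have been working on code that reads the lines of a file and then organizes them. At one point I got stuck, and a friend told me what I could use. The code works, but I don't understand what he is doing on the 7th and 8th lines from the bottom. I marked them with ##### so you can see which lines I mean.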

So, essentially, how can you rewrite these two lines of code, and why do they work? I don't seem to understand dictionaries.

from sys import argv

filename = input("Please enter the name of a file: ")
file_in=(open(filename, "r"))

print("Number of times each animal visited each station:")
print("Animal Id             Station 1              Station 2")

animaldictionary = dict()

for line in file_in:
    if '\n' == line[-1]:
        line = line[:-1]
    (a, b, c) = line.split(':')
    ac = (a,c)
    if ac not in animaldictionary:
        animaldictionary[ac] = 0
    animaldictionary[ac] += 1

alla = []
for key, value in animaldictionary:
    if key not in alla:
        alla.append(key)
print ("alla:",alla)
allc = []
for key, value in animaldictionary:
    if value not in allc:
        allc.append(value)    
print("allc", allc)

for a in sorted(alla):
    print('%9s'%a,end=' '*13)
    for c in sorted(allc):
        ac = (a,c)
        valc = 0
        if ac in animaldictionary:
            valc = animaldictionary[ac]
        print('%4d'%valc,end=' '*19)

    print()

print("="*60)
print("Animals that visited both stations at least 3 times: ")

for a in sorted(alla):
    x = 'false'
    for c in sorted(allc):
        ac = (a,c)
        count = 0
        if ac in animaldictionary:
            count = animaldictionary[ac]
            if count >= 3:
                x = 'true'
    if x is 'true':    
        print('%6s'%a, end=' ')
        print("")

print("="*60)
print("Average of the number visits in each month for each station:")

#(alla, allc) = 
#for s in zip(*animaldictionary.keys()):
#    (alla,allc).append(s)
#print(alla, allc)

(alla,allc,) = (set(s) for s in zip(*animaldictionary.keys())) ##### how else can you write this
##### how else can you rewrite the next code
print('\n'.join(['\t'.join((c,str(sum(animaldictionary.get(ac,0) for a in alla for ac in ((a,c,),))//12)))for c in sorted(allc)]))

print("="*60)
print("Month with the maximum number of visits for each station:")
print("Station             Month               Number")

print("1")
print("2")

score 3 · Accepted Answer

The two lines you point out are indeed rather confusing. I'll explain them as best I can and suggest alternative implementations.

The first one computes the values of alla and allc:

(alla,allc,) = (set(s) for s in zip(*animaldictionary.keys()))

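This is almost exactly equivalent to the loops you already wrote above to build the alla and allc lists. You could skip it entirely if you wanted. Still, let's unpack what it does, so you can really understand it.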

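The innermost part is animaldictionary.keys(). This returns an iterable containing all the keys of the dictionary. Since the keys in animaldictionary are two-element tuples, that is what you get from the iterable. In most cases you don't actually need to call keys when working with a dictionary, because an operation on the keys view usually behaves the same as that operation on the dictionary itself.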
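A small sketch of that last point. The sample dictionary is made up for illustration and is not data from the question; it only shows that iterating over (or testing membership in) the dictionary gives the same result as going through keys().

```python
# Hypothetical sample data: (animal id, station) -> visit count.
sample = {("a01", "s1"): 3, ("a01", "s2"): 1, ("a02", "s1"): 4}

keys_via_view = list(sample.keys())  # explicit keys() call
keys_via_dict = list(sample)         # iterating the dict directly

# Membership tests look at the keys in both cases as well.
in_view = ("a01", "s1") in sample.keys()
in_dict = ("a01", "s1") in sample
```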

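Moving outward, the keys are passed to zip with zip(*keys). Two things happen here. First, the * syntax unpacks the iterable above into separate arguments. So if the keys of animaldictionary were ("a1", "c1"), ("a2", "c2"), ("a3", "c3"), zip would be called with those three tuples as separate arguments. Second, what zip does is turn several iterable arguments into a single iterable: it yields a tuple holding the first value of each argument, then a tuple holding the second value of each argument, and so on. So zip(("a1", "c1"), ("a2", "c2"), ("a3", "c3")) returns an iterator that yields ("a1", "a2", "a3") followed by ("c1", "c2", "c3").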
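The example from the paragraph above can be checked directly. This is only a sketch using the three illustrative keys; the variable names are assumptions.

```python
keys = [("a1", "c1"), ("a2", "c2"), ("a3", "c3")]

# zip(*keys) is the same call as zip(keys[0], keys[1], keys[2]).
unpacked = list(zip(*keys))
spelled_out = list(zip(("a1", "c1"), ("a2", "c2"), ("a3", "c3")))

print(unpacked)  # [('a1', 'a2', 'a3'), ('c1', 'c2', 'c3')]
```

zip returns a lazy iterator in Python 3, which is why the sketch wraps it in list() before printing.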

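The next part is a generator expression that passes each value produced by zip to the set constructor. This serves to eliminate any duplicates. set instances are useful in other ways too (finding intersections, for example), but none of that is needed here.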

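Finally, the two sets of a and c values are assigned to the variables alla and allc. They replace the lists you already had under those names (with the same contents!).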
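Putting those steps together, here is a sketch of the one-liner next to a step-by-step version. The counts in the dictionary are invented for the example.

```python
# Hypothetical counts; animal ids and stations repeat on purpose.
animaldictionary = {("a01", "s1"): 3, ("a01", "s2"): 1, ("a02", "s1"): 4}

# The one-liner from the question.
(alla, allc,) = (set(s) for s in zip(*animaldictionary.keys()))

# The same thing in separate steps.
first_parts, second_parts = zip(*animaldictionary)  # two tuples
alla_steps = set(first_parts)   # duplicates such as "a01" collapse
allc_steps = set(second_parts)
```

Both forms raise ValueError on an empty dictionary, because zip then yields nothing to unpack into the two names.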

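You already have one alternative, where you compute alla and allc as lists. Using sets may be slightly more efficient, but it probably doesn't matter much for a small amount of data. Another, more explicit way to do it is: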

alla = set()
allc = set()
for key in animaldictionary:  # note: iterating over a dict yields its keys!
    a, c = key  # unpack the tuple key
    alla.add(a)
    allc.add(c)

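The second line you asked about does some averaging and combines the results into one giant string, which it then prints. Cramming this much into a single line is really poor programming style. It also does a few unnecessary things, which makes it even more confusing. Here it is, with a couple of line breaks added so it all fits on screen at once: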

print('\n'.join(['\t'.join((c,str(sum(animaldictionary.get(ac,0)
                                      for a in alla for ac in ((a,c,),))//12)
                           )) for c in sorted(allc)]))

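The innermost part of this is for ac in ((a,c,),). This is silly, because it is a loop over a 1-element tuple. It is a way of giving the tuple (a,c) the name ac, but it is very confusing and unnecessary.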

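If we replace the single use of ac with the tuple written out explicitly, the new innermost piece is animaldictionary.get((a,c),0). This is a special way of writing animaldictionary[(a, c)] that does not risk raising a KeyError if (a, c) is not in the dictionary. Instead, the default value of 0 (passed to get) is returned for keys that do not exist.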
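A minimal sketch of the difference between get and plain indexing, using a made-up one-entry dictionary:

```python
animaldictionary = {("a01", "s1"): 3}  # hypothetical data

present = animaldictionary.get(("a01", "s1"), 0)  # key exists
missing = animaldictionary.get(("a01", "s2"), 0)  # key absent, default used

# Plain indexing raises KeyError for the absent key instead.
try:
    animaldictionary[("a01", "s2")]
    raised = False
except KeyError:
    raised = True
```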

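That get call is wrapped in this: (getcall for a in alla). This is a generator expression that fetches every value from the dictionary whose key contains the given c value (defaulting to zero when there is no such entry).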

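The next step takes the average of the values from the previous generator expression: sum(genexp)//12. This is pretty straightforward, though you should note that dividing with // always rounds down to an integer (floor division). If you want a more precise floating-point value, use / instead.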
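To make the rounding behaviour concrete, here is a short sketch with an invented total:

```python
total = 41  # hypothetical yearly total for one station

floor_average = total // 12  # floor division, gives an int
float_average = total / 12   # true division, gives a float

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -41 // 12
```

With these numbers floor_average is 3 while float_average is a little over 3.4, and negative_floor is -4.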

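The next part is a call to '\t'.join whose argument is a single (c, avg) tuple. This is an awkward construction that could be written more clearly as c+"\t"+str(avg) or "{}\t{}".format(c, avg). All of these produce a string containing the c value, a tab character, and the string form of the average computed above.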
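The three spellings can be compared side by side. The values of c and avg below are placeholders chosen for the example.

```python
c = "s1"  # hypothetical station label (join needs it to be a string)
avg = 3   # hypothetical average

joined = "\t".join((c, str(avg)))
concatenated = c + "\t" + str(avg)
formatted = "{}\t{}".format(c, avg)
```

The format version has the small advantage that it would also work if c were not a string.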

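The next step is a list comprehension, [joinedstr for c in sorted(allc)] (where joinedstr is the join call from the previous step). Using a list comprehension here is a bit odd, since no list is needed (a generator expression would do just as well).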

Finally, the result of the list comprehension is joined with newlines and printed: print("\n".join(listcomp)). This is simple enough.
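As a sketch of the point about not needing a list, join accepts a generator expression directly. The rows below are invented (station, average) pairs.

```python
rows = [("s1", 3), ("s2", 0)]  # hypothetical (station, average) pairs

# With a list comprehension, as in the original line.
from_list = "\n".join(["{}\t{}".format(c, avg) for c, avg in rows])

# With a generator expression: same result, no square brackets.
from_genexp = "\n".join("{}\t{}".format(c, avg) for c, avg in rows)
```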

Anyway, the whole mess can be rewritten much more clearly by using a few variables and printing each line separately in a loop:

for c in sorted(allc):
    total_values = sum(animaldictionary.get((a,c),0) for a in alla)
    average = total_values // 12

    print("{}\t{}".format(c, average))

Finally, I have a couple of general suggestions.

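First, your data structure may not be the best choice for the way you use your data. Rather than animaldictionary being a dictionary with (a,c) keys, it might make more sense to have a nested structure in which you index each level separately, that is, animaldictionary[a][c]. It might even make sense to keep a second dictionary holding the same values indexed in the opposite order (one indexed [a][c], the other indexed [c][a]). With this approach you might not need the alla and allc lists for iterating at all (you would just loop over the contents of the main dictionary directly).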
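One possible way to build such a nested structure is with collections.defaultdict. This is only a sketch: the input lines are invented, the names visits and by_station are hypothetical, and the "a:b:c" layout is assumed from the line.split(':') call in the question.

```python
from collections import defaultdict

# Hypothetical input lines in the question's "a:b:c" layout.
lines = ["a01:01-24-2011:s1", "a01:01-25-2011:s1", "a02:01-24-2011:s2"]

# visits[a][c] -> count; missing entries start at 0 automatically.
visits = defaultdict(lambda: defaultdict(int))
# by_station[c][a] holds the same counts indexed the other way round.
by_station = defaultdict(lambda: defaultdict(int))

for line in lines:
    a, b, c = line.rstrip("\n").split(":")
    visits[a][c] += 1
    by_station[c][a] += 1

# No separate alla/allc collections are needed to iterate.
for a in sorted(visits):
    for c in sorted(visits[a]):
        print(a, c, visits[a][c])
```

Keeping two dictionaries costs a little extra memory, but it makes per-animal and per-station loops equally direct.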

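My second suggestion is about code style. Many of your variables are poorly named, either because the name carries no meaning (c, for example) or because the name implies a meaning that is incorrect. The most glaring case is your key and value variables, which actually unpack the two parts of the key (that is, a and c). In other situations you can get keys and values together, but only when you iterate over the dictionary's items() view rather than over the dictionary directly.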
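The difference can be seen in a short sketch with made-up data. Because each key is a 2-tuple, unpacking while iterating over the dictionary splits the key itself.

```python
animaldictionary = {("a01", "s1"): 3, ("a02", "s2"): 1}  # hypothetical

# Iterating the dict yields keys; each key is a 2-tuple, so this
# unpacks the two parts of the key, not a key and a value.
parts = [(first, second) for first, second in animaldictionary]

# items() yields (key, value) pairs.
pairs = [(key, value) for key, value in animaldictionary.items()]
```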
