python - 如何优化这些嵌套循环？

Question

谁能优化这个代码块？它正在工作，但运行速度非常慢。

maxsat = 0
possiblevotes = []
for i in range(1,int(numcats)+1):
    for j in range(1,int(numdogs)+1):
        possiblevotes.append('C' + str(i) + ' ' + 'D' + str(j))
        possiblevotes.append('D' + str(j) + ' ' + 'C' + str(i))
for m in possiblevotes:
    count = 0
    for n in votes:
        if m == n:
            count += 1
        elif m.split()[0] == n.split()[0]:
            count += 1
    if count > maxsat:
        maxsat = count

score 1 · Accepted Answer

无需生成所有可能的选票。您可以测试您的实际投票，而无需生成possiblevotes列表，因为您可以轻松计算现有投票是否可行。

您也只真正计算“留下”选票。m == n寻找匹配的“坚持下去”投票并不重要，因为任何正确的“坚持下去”投票m.split()[0] == n.split()[0]也是正确的。所以你不妨放弃第一个计数，只看第二个。

现在您只是在找到最大stay票数。使用 acollections.Counter()使计算事情变得更容易：

import collections

vote_counts = collections.Counter(v.split()[0] for v in votes)

maxsat = vote_counts.most_common(1)[0][1]  # retrieve the most popular count

这会计算出与您的代码计算出的数字相同的数字，但现在我们只需循环一次投票，并且只计算“留”票。

将此与您的循环进行对比，您首先循环numcats * numdogs时间，然后循环numcats * numdogs * 2 * len(votes)时间。那一个因素3 * numcats * numdogs 更大。

如果您必须首先验证投票，您可以使用：

from itertools import ifilter

numcats = int(numcats)
numdogs = int(numdogs)

def validvote(vote):
    stay, go = vote.split()
    cat, dog = sorted((stay, go))
    if (cat[0], dog[0]) != ('C', 'D'):
        return False
    if not (1 >= int(cat[1:]) >= numcats):
        return False
    if not (1 >= int(dog[1:]) >= numdogs):
        return False
    return True

vote_counts = collections.Counter(v.split()[0] for v in ifilter(validvote, votes))

你也可以开始使用go投票：

stay_votes = collections.Counter()
go_votes = collections.Counter()

for vote in ifilter(validvote, votes):
    stay, go = vote.split()
    stay_votes[stay] += 1
    go_votes[go] += 1

现在您可以简单地从留票数中减去go 票（删除任何降至 0 的票数）：

total_votes = stay_votes - go_votes

# Display top 10
for creature, tally in total_votes.most_common(10):
    print('{}: {:>#5d}'.format(creature, tally))

当然，你也可以一次性完成计算：

total_votes = collections.Counter()

for vote in ifilter(validvote, votes):
    stay, go = vote.split()
    total_votes[stay] += 1
    total_votes[go] -= 1

但是将投票记录分开可能对以后的分析很有趣。

score 0 · Accepted Answer

使用字典而不是列表：

possiblevotes = {}
for i in range(1,int(numcats)+1):
    for j in range(1,int(numdogs)+1):
        possiblevotes['C' + str(i) + ' ' + 'D' + str(j)] = 0
        possiblevotes['D' + str(j) + ' ' + 'C' + str(i)] = 0
for n in votes:
    possiblevotes[n] += 1
....

score 0 · Accepted Answer

import re

vote_pattern = re.compile('^(C|D)\d+\s')

votes = ['123', 'A1123', 'cC32', 'C', 'D0', 'C11']
maxsat = sum(0 if vote_pattern.match(vote) is None else 1 for vote in votes)

当然，您可以将这个可怕的总和更改为过滤器之类的东西。

score 0 · Accepted Answer

由于嵌套循环，代码需要很长时间。如果你有 1000 只猫、1000 条狗和 1000 张选票，那么第一组循环运行 1000x1000 次；第二组运行 1000x1000x1000 次。如果我们可以删除嵌套循环，那么您的代码将运行得更快。

我注意到你似乎统计了选票，其中“C1 D3”与“D3 C1”相同。我建议您使用集合模块中的 Counter 类来完成繁重的工作。这是我的解决方案：

import collections

if __name__ == '__main__':
    votes = ['C1 D3', 'D1 C5', 'D3 C1', 'd1 c1', 'c1 d3'] # Example votes

    # Normalize the votes: 'D3 C1' becomes 'C1 D3',
    # 'c1 d3' becomes 'C1 D3'
    normalized_votes = [''.join(sorted(v.upper().split())) for v in votes]

    # Count the votes
    counter = collections.Counter(normalized_votes)

    # Top 10
    print '--- TOP 10 ---'
    for vote, count in counter.most_common(10):
        print count, vote

    # Or print all
    print '--- ALL ---'
    for vote, count in counter.iteritems():
        print count, vote

讨论

该解决方案使用 4 个循环：第一个是派生normalized_votes，第二个是创建计数器变量。最后两个循环处理打印结果。这些循环都不是嵌套的。有人可能会争辩说，Counter该类的实现可能包含嵌套循环，但我相信该类的实现尽可能高效。
一个重要的步骤是使选票标准化，这将大大简化您的计票。虽然我已经在一行中完成了，但您可以将其分解为几个步骤以帮助理解。
1. 第一步是将所有投票转换为大写：'d3 c1' 变为 'D3 C1'。
2. 接下来，我用函数将它们分解成一个列表split()：'D3 C1' 变为 ['D3', 'C1']。
3. 第三步，对每一项进行排序：['D3', 'C1'] 变为 ['C1', 'D3']
4. 最后一步将它们粘在一起： ['C1', 'D3'] 变为 'C1 D3'

python - 如何优化这些嵌套循环？

4 回答 4

讨论

Related

Reference