python - I don't understand this simple loop

Question

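“

Suppose we are interested in the time zones that occur most often in the data set (the tz field). There are many ways we could do this. First, let's extract a list of time zones, again using a list comprehension: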

In [26]: time_zones = [rec['tz'] for rec in records if 'tz' in rec]
In [27]: time_zones[:10]
Out[27]: [u'America/New_York', u'America/Denver', u'America/New_York', u'America/Sao_Paulo', u'America/New_York', u'America/New_York', u'Europe/Warsaw', u'', u'', u'']

Now, to produce counts by time zone:

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts

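”
This is an excerpt from a textbook, and I don't really understand the loop used to count how many times each time zone occurs. Could someone break it down for me intuitively? I'm a beginner.

Follow-up question:

“

If we wanted the top 10 time zones and their counts, we have to do a little bit of dictionary acrobatics: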

def top_counts(count_dict, n=10):
    value_key_pairs = [(count, tz) for tz, count in count_dict.items()]
    value_key_pairs.sort()
    return value_key_pairs[-n:]

”
The quotation marks mark the excerpts. Could someone explain what is happening in the function top_counts?

score 8 · Accepted Answer

def get_counts(sequence):   # Defines the function.
    counts = {}             # Creates an empty dictionary.
    for x in sequence:      # Loops through each item in sequence.
        if x in counts:     # If the item already exists in the dictionary...
            counts[x] += 1  # ...add one to its current count.
        else:               # Otherwise...
            counts[x] = 1   # ...add the item to the dictionary with a count of 1.
    return counts           # Returns the resulting dictionary.
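As a quick check, here is a minimal sketch that runs the function on the ten time zones shown in the question's output. The sample list is copied from the question (without the Python 2 `u` prefixes); the expected numbers in the comments come from counting that list by hand.

```python
def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts

# The first ten time zones from the question's Out[27].
time_zones = ['America/New_York', 'America/Denver', 'America/New_York',
              'America/Sao_Paulo', 'America/New_York', 'America/New_York',
              'Europe/Warsaw', '', '', '']

counts = get_counts(time_zones)
print(counts['America/New_York'])  # 4
print(counts[''])                  # 3 (the empty string is counted like any other value)
print(counts['Europe/Warsaw'])     # 1
```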

score 1 · Accepted Answer

The main operation here is the dictionary lookup.

if x in counts:

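checks whether the time zone has already been counted. If it exists in the counts dictionary, its count is incremented. If it does not exist yet, a new entry is created and set to 1.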
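The same check-then-update step can also be written in one line with `dict.get`, which returns a default when the key is missing. This is only an alternative sketch, not the textbook's version; the function name `get_counts_with_get` is made up for illustration.

```python
def get_counts_with_get(sequence):
    counts = {}
    for x in sequence:
        # counts.get(x, 0) is the current count, or 0 if x has not been seen yet.
        counts[x] = counts.get(x, 0) + 1
    return counts

result = get_counts_with_get(['a', 'b', 'a'])
print(result)  # {'a': 2, 'b': 1}
```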

score 0 · Accepted Answer

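This basically uses a dictionary (or hash table) to store how many times each time zone occurs. Each total is stored in counts, keyed by the time zone string. This lets us quickly look up an existing count so that we can add one to it.

First, we iterate over each value in sequence: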

for x in sequence:

On each iteration, x equals the current value. For example, on the first iteration x equals America/New_York.

Next, we have this confusing part:

if x in counts:
   counts[x] += 1 
else:
   counts[x] = 1

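Since you can't increment something that doesn't exist, we first need to check whether the key is already in the map. If we have never encountered that time zone before, it won't be there. In that case we set its initial value to 1, because we know it has occurred at least once so far.

If it already exists (x is in counts), we just add one to that key's value: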

counts[x] += 1

Hopefully this makes more sense now!
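To watch this happen, here is a small sketch that prints the dictionary after every iteration. The three-item sequence is a shortened, made-up sample, and the `history` list exists only to record each intermediate state.

```python
sequence = ['America/New_York', 'America/Denver', 'America/New_York']

counts = {}
history = []  # snapshot of counts after each step, for inspection only
for x in sequence:
    if x in counts:
        counts[x] += 1
    else:
        counts[x] = 1
    history.append(dict(counts))  # copy, so later updates don't change the snapshot
    print(x, '->', counts)

# America/New_York -> {'America/New_York': 1}
# America/Denver -> {'America/New_York': 1, 'America/Denver': 1}
# America/New_York -> {'America/New_York': 2, 'America/Denver': 1}
```

The first and second items take the else branch because their keys are new; the third takes the if branch because America/New_York is already in counts.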

score 0 · Accepted Answer

Given that the sequence is [u'America/New_York', u'America/Denver', u'America/New_York', u'America/Sao_Paulo', u'America/New_York', u'America/New_York', u'Europe/Warsaw', u'', u'', u'']

it goes like this:

  def get_counts(sequence):
      counts = {}
      for x in sequence:      # traverse sequence; u'America/New_York' is the first item
          if x in counts:     # if u'America/New_York' in counts:
              counts[x] += 1  #     counts[u'America/New_York'] += 1
          else:               # else:
              counts[x] = 1   #     counts[u'America/New_York'] = 1
                              # and so on...
      return counts

score 0 · Accepted Answer

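The function get_counts does the following:

For each time zone in the list:

Check whether the time zone is already in the dictionary (if x in counts).
If so, increase its count by 1 (counts[x] += 1).
If not, initialize the count to 1 (counts[x] = 1).

If you're curious, you could also do it like this: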

from collections import Counter
ctr = Counter()
for x in sequence:
    ctr[x] += 1

A Counter automatically returns 0 for missing items, so you don't need to initialize anything.
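For completeness, here is a short sketch of that behaviour using the question's sample list. `Counter` can also be built directly from an iterable, and its `most_common` method returns the largest counts first; the lookup key 'Asia/Tokyo' is just an arbitrary value that is not in the sample.

```python
from collections import Counter

# The first ten time zones from the question's Out[27].
time_zones = ['America/New_York', 'America/Denver', 'America/New_York',
              'America/Sao_Paulo', 'America/New_York', 'America/New_York',
              'Europe/Warsaw', '', '', '']

ctr = Counter(time_zones)        # counts every item in one call
print(ctr['America/New_York'])   # 4
print(ctr['Asia/Tokyo'])         # 0, a missing key reads as zero instead of raising KeyError
print(ctr.most_common(2))        # [('America/New_York', 4), ('', 3)]
```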

score 0 · Accepted Answer

Re: the follow-up question.

def top_counts(count_dict, n=10):
    value_key_pairs = [(count, tz) for tz, count in count_dict.items()] # Converts dictionary into a list of tuples, i.e. {'aaa': 1, 'bbb': 12, 'ccc': 4} into [(1, 'aaa'), (12, 'bbb'), (4, 'ccc')]
    value_key_pairs.sort() # Sorts the list. Default comparison function applied to tuples compares first elements first, and only if they are equal looks at second elements.
    return value_key_pairs[-n:] # Returns the slice of the sorted array that has last n elements.
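Running the function on the small dictionary from the comment above makes the three steps concrete. This is a sketch only; `top_counts_desc` is a hypothetical variant, not part of the textbook, that returns the largest counts first.

```python
def top_counts(count_dict, n=10):
    value_key_pairs = [(count, tz) for tz, count in count_dict.items()]
    value_key_pairs.sort()       # ascending by count, then by key on ties
    return value_key_pairs[-n:]  # the last n pairs are the n largest

example = {'aaa': 1, 'bbb': 12, 'ccc': 4}
top_two = top_counts(example, n=2)
print(top_two)  # [(4, 'ccc'), (12, 'bbb')]

# Hypothetical variant: sort in descending order and take the first n.
def top_counts_desc(count_dict, n=10):
    pairs = sorted(((count, tz) for tz, count in count_dict.items()), reverse=True)
    return pairs[:n]

top_two_desc = top_counts_desc(example, n=2)
print(top_two_desc)  # [(12, 'bbb'), (4, 'ccc')]
```

Note that the original returns its result in ascending order, so the most frequent time zone comes last.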
