python - 一圈？Python

Question

所以我写了这个给定可能数字的函数，它必须在构成给定数字的可能数字中找到两个数字。但是，我仍在学习 Python（一种非常棒的语言），所以我只能使用有限的一组函数。

我创建了这个函数：

def sumPair(theList, n):

    theList = charCount(theList) #charCount is a function i made to convert the list into a dictionary
    for i in theList:
        for a,b in theList.iteritems():
            print a,b
            if a + i == n:
                if theList[b] > 1:
                    return [i, b]
                if a != i:
                    return [i, b]
        return "[]"
print sumPair([6,3,6,8,3,2,8,3,2], 11)

就像我说的，它会找到两个加起来等于给定数字的数字。charCount是我编写的将数组添加到字典中的函数。

在这个程序中，我确保该值大于 1，以防添加的数字相同。有时，如果它检查 10 的总和并且你给它一个数字 5，它只会将 5 添加到自身并返回 10。这if theList[b] > 1: 就是存在的原因。

为什么我在这里？我的教练对两个循环不满意。我花了 5 个小时排除故障，但一无所获。我需要将此程序转换为单循环程序。

我花了一整天的时间，我不是想让你做我的功课，我只是真的被困住了，我需要你的帮助。我听说我应该检查是否存在密钥来完成这项工作。

score 9 · Accepted Answer

讲师可能不满意您的算法花费的时间比它必须的时间长。试试这个：

for each element x in theList
  if there exists an element y in theList such that x+y = n, you have a match

您需要快速进行“如果存在”测试，这就是您使用字典的目的。一个循环将建立这个字典，第二个循环将搜索。这将花费线性时间与您的 O(n^2) 时间。

您关于 5 与自身匹配的观点是一个很好的观点。您想使用称为多重集或包的数据结构。阅读它，然后以这种方式实现您的代码：

for each element x in theList
  if there exists an element y in theList such that x+y == n:
    if x != y or (x == y and x occurs more than once):
      you have a match

祝你好运。

编辑因为有很多次优解决方案，这里是简单的线性解决方案（它是线性的，因为列表中有 2 个循环，但循环一个接一个。所以，2*n 迭代，O(n)。它非常快。

#!/usr/bin/python2.6

from collections import defaultdict

def sum_list(l, n):
  count = defaultdict(lambda: 0)

  for x in l:      # O(n)
    count[x] += 1  # This can be done in constant time O(1)

  for x in l:      # O(n)
    y = n - x
    if count[y] > 0 and (x != y or count[y] > 1): # O(1)
      return x, y

score 4 · Accepted Answer

您可以单独检查特殊情况。n%2意思是“n mod 2”所以它0是n偶数

def sumPair(theList, n):
    # If n is even, check whether n/2 occurs twice or more in theList
    if n%2 == 0 and theList.count(n/2) > 1:
        return [n/2, n/2]

    theSet = set(theList)
    for i in theSet:
        if n-i in theSet:
            return [i, n-i]
    return []

score 3 · Accepted Answer

用铅笔和纸，甚至只看纸上的数字行来思考问题总是有帮助的。然而，更好的解决方案一开始可能看起来过于复杂，而且它们的优势乍一看可能并不那么明显——请参阅gnibbler 的解决方案（他的回答是我个人的赢家，见下文）。

首先，您需要将一个数字与其他所有数字进行比较。然后是第二个数字，以此类推。当使用天真的方法时，在使用单个处理器时无法避免两个嵌套循环。那么时间复杂度总是 O(n^2) 其中 n 是序列的长度。事实是，一些循环可能隐藏在操作中，in或者list.index()原则上不会使解决方案更好。

想象一下这些数字的笛卡尔积——它由一对数字组成。这样的对有 n^2 个，但就加法运算的交换性质而言，大约有一半是相同的，其中 n 个是有自身的对。这意味着您只需要检查n^2 / 2 - n对。如果它们适合测试，最好避免循环遍历不必要的对，而不是稍后进行测试：

for each first element in theList:
    for each second element in the rest of theList from the checked one on:
        if the first and the second elements give the solution:
            report the result
            possibly early break if only the first should be reported

对选中的列表的其余部分使用切片，在enumerate()第一个（也可能在第二个）循环中使用来了解索引。

尽量减少循环中的操作总是一个好主意。想想内循环体是做的次数最多的。这样您就可以在进入内部循环之前计算搜索到的数字：searched = sum - first. 然后第二个循环加上if可以替换为if searched in the rest of theList:

[在此处出现更完整的解决方案后编辑]

这是找到第一次出现或无的 O(n^2) 解决方案（纯 Python，简单，没有库，内置函数和切片，几行）：

def sumPair(theList, n):
    for index, e in enumerate(theList):     # to know the index for the slicing below
        complement = n - e                  # we are searching for the complement
        if complement in theList[index+1:]: # only the rest is searched
            return e, complement            

print sumPair([6,3,6,8,3,2,8,3,2], 11)

[在 gnibbler 对切片和复制的评论之后添加]

gnibbler 关于切片是正确的。切片是副本。（问题是切片是否没有使用“写入时复制”技术进行优化——我不知道。如果是，那么切片将是一种廉价的操作。）为了避免复制，可以使用该list.index()方法进行测试允许传递起始索引。唯一奇怪ValueError的是，当找不到该项目时它会引发异常。这样，if complement...必须替换为try ... except：

def sumPair2(theList, n):
    for ind, e in enumerate(theList):
        try:
            theList.index(n - e, ind + 1)
            return e, n - e
        except ValueError:
            pass

Gnibbler 的评论让我更多地思考这个问题。事实是，set可以接近 O(1) 来测试它是否包含元素和 O(n) 来构造集合。对于非数字元素（其中集合类型无法实现为位数组），这不是很清楚。当哈希数组发挥作用并且应该使用其他技术解决可能的冲突时，那么质量取决于实现。

如有疑问，请测量。这里 gnibbler 的解决方案稍作修改，与其他解决方案相同：

import timeit

def sumPair(theList, n):
    for index, e in enumerate(theList):
        if n - e in theList[index+1:]:
            return e, n - e

def sumPair2(theList, n):
    for ind, e in enumerate(theList):
        try:
            theList.index(n - e, ind + 1)
            return e, n - e
        except ValueError:
            pass

def sumPair_gnibbler(theList, n):
    # If n is even, check whether n/2 occurs twice or more in theList
    if n%2 == 0 and theList.count(n/2) > 1:
        return n/2, n/2

    theSet = set(theList)
    for e in theSet:
        if n - e in theSet:
            return e, n - e

问题中的原始数字用于第一次测试。n = 1无法找到解决方案时的最坏情况的原因：

theList = [6,3,6,8,3,2,8,3,2]

n = 11
print '---------------------', n
print sumPair(theList, n), 
print timeit.timeit('sumPair(theList, n)', 'from __main__ import sumPair, theList, n', number = 1000)

print sumPair2(theList, n), 
print timeit.timeit('sumPair2(theList, n)', 'from __main__ import sumPair2, theList, n', number = 1000)

print sumPair_gnibbler(theList, n),
print timeit.timeit('sumPair_gnibbler(theList, n)', 'from __main__ import sumPair_gnibbler, theList, n', number = 1000)

n = 1
print '---------------------', n
print sumPair(theList, n), 
print timeit.timeit('sumPair(theList, n)', 'from __main__ import sumPair, theList, n', number = 1000)

print sumPair2(theList, n), 
print timeit.timeit('sumPair2(theList, n)', 'from __main__ import sumPair2, theList, n', number = 1000)

print sumPair_gnibbler(theList, n),
print timeit.timeit('sumPair_gnibbler(theList, n)', 'from __main__ import sumPair_gnibbler, theList, n', number = 1000)

它在我的控制台上产生以下输出：

--------------------- 11
(3, 8) 0.00180958639191
(3, 8) 0.00594907526295
(8, 3) 0.00124991060067
--------------------- 1
None 0.00502748219333
None 0.026334041968
None 0.00150958864789

从那短的数字序列和一个特殊情况来看，就时间复杂度而言，不可能说任何质量。无论如何，gnibbler 的解决方案赢了。

在序列包含唯一值的情况下，gnibbler 的解决方案使用最多的内存。让我们尝试更长的序列，包含 0、1、2、...、9999。n 等于 11 和 3000 代表有解决方案的任务。对于 n 等于 30000 的情况，无法找到这对数字。必须检查所有元素——最坏的情况：

theList = range(10000)

n = 11
print '---------------------', n
print sumPair(theList, n), 
print timeit.timeit('sumPair(theList, n)', 'from __main__ import sumPair, theList, n', number = 100)

print sumPair2(theList, n), 
print timeit.timeit('sumPair2(theList, n)', 'from __main__ import sumPair2, theList, n', number = 100)

print sumPair_gnibbler(theList, n),
print timeit.timeit('sumPair_gnibbler(theList, n)', 'from __main__ import sumPair_gnibbler, theList, n', number = 100)

n = 3000
print '---------------------', n
print sumPair(theList, n), 
print timeit.timeit('sumPair(theList, n)', 'from __main__ import sumPair, theList, n', number = 100)

print sumPair2(theList, n), 
print timeit.timeit('sumPair2(theList, n)', 'from __main__ import sumPair2, theList, n', number = 100)

print sumPair_gnibbler(theList, n),
print timeit.timeit('sumPair_gnibbler(theList, n)', 'from __main__ import sumPair_gnibbler, theList, n', number = 100)

n = 30000
print '---------------------', n
print sumPair(theList, n), 
print timeit.timeit('sumPair(theList, n)', 'from __main__ import sumPair, theList, n', number = 100)

print sumPair2(theList, n), 
print timeit.timeit('sumPair2(theList, n)', 'from __main__ import sumPair2, theList, n', number = 100)

print sumPair_gnibbler(theList, n),
print timeit.timeit('sumPair_gnibbler(theList, n)', 'from __main__ import sumPair_gnibbler, theList, n', number = 100)

请注意，序列要长得多。该测试仅重复 100 次，以便在合理的时间内获得结果。（除非您将时间除以，否则无法将时间与之前的测试进行比较number。）它在我的控制台上显示以下内容：

--------------------- 11
(0, 11) 0.00840137682165
(0, 11) 0.00015695881967
(0, 11) 0.089894683992
--------------------- 3000
(0, 3000) 0.0166750746034
(0, 3000) 0.00966040735374
(0, 3000) 0.12532849753
--------------------- 30000
None 180.328006493
None 163.651082944
None 0.204691100723

在非最坏情况下，gnibbler 的解决方案似乎很慢。原因是它需要经历所有序列的准备阶段。幼稚的解决方案在大约三分之一的第一遍中找到了数字。能说明一切的是最坏的情况。gnibbler 的解决方案大约快 1000 倍，并且对于更长的序列，差异会增加。 Gnibbler 的解决方案显然是赢家。

score 2 · Accepted Answer

我的评论：

如果将列表转换为字典，请将名称从更改为theList其他名称。theList有一个名为保存字典的变量是令人困惑的。
如果这两个数字总是彼此相邻，您可以尝试编写一个循环，将索引变量设置i为 0，然后递增i，检查是否theList[i] + theList[i + 1]等于所需的数字。
如果这两个数字可能找不到彼此相邻，那就更棘手了。最明显的方法是使用两个循环：一个是依次查看每个数字，另一个是查看后面的数字，看它们是否与目标相加。如果这两个数字不必彼此相邻，并且讲师希望您只使用一个循环，那么您将需要使用一些东西来保存列表值（如字典）或可能使用“隐式环形”。

“隐式循环”是什么意思？Python 提供了一个运算符，in它会告诉你一个对象是否在 Python 列表中。这是通过循环遍历列表来工作的，但是您不编写循环；循环是在 Python 中为您完成的。

所以，你可以这样做：依次查看每个数字。从目标值中减去数字，然后用于in查看计算的值是否在列表的其余部分中。要搜索列表的其余部分并跳过当前值，请使用“列表切片”。如果您还没有学习切片，那么您的讲师可能不会寻找列表切片的答案。

这是一个例子。如果您已i设置为 0，并且您正在查看列表中的第一个条目，则值为 6。目标值为 11。因此计算的值为 (11 - 6) 或 5。然后您将检查是否computed_value in theList. 要仅查看列表的其余部分，您可以使用切片，然后您将拥有computed_value in theList[1:]. 如果你有一个索引变量i，你会有类似的东西computed_value = 11 - theList[i]，然后检查是否computed_value in theList[i:].

不要忘记，可以使用for循环来创建从 0 到列表长度的索引，并使用索引索引到列表中。在 Python 中，通常最好使用for x in lst:which 集合x来表示列表中的连续对象，但有时您会遇到一个问题，即先使用for i in xrange(len(lst)):然后使用lst[i]orlst[i+1]或lst[i:]or 什么是有用的。
使用老师想要的任何编码风格。如果您的老师想要“theList”之类的“camelCase”名称，请执行此操作。但是 Python 的常用样式称为“PEP 8”，该样式中的变量名是带有下划线的小写字母，例如“the_list”。 http://www.python.org/dev/peps/pep-0008/

score 2 · Accepted Answer

纯 python - 不使用库：

def combinations(lst):
    l = lst[:] # copy of source list
    while len(l) > 1:
        a = l.pop(0)
        for b in l:
           yield a,b

def first_pair(iterator):
    for i in iterator:
       # just return first element
       return i

def sum_pair(lst, sum_val):
    return first_pair((a,b) for a,b in combinations(lst) if (a+b) == sum_val)

print sum_pair([6,3,6,8,3,2,8,3,2], 11)
# result (3,8)

使用迭代工具：

from itertools import combinations, islice

def sum_pair(lst, sum_val):
    return list(islice(((a,b)
        for a,b in combinations(lst, 2) if (a+b) == sum_val), 0, 1))

print sum_pair([6,3,6,8,3,2,8,3,2], 11)
# result [(3,8)]

score 2 · Accepted Answer

这是一个非常好的问题，我曾经被问过一种方法来在线性时间和恒定空间中做到这一点，直到今天我不知道如何实现这一点。

这是一个简单的实现，我确定有一种方法可以使用缓存使它更快一点，但那是假设列表中的每个整数都不唯一，如果是，我认为缓存不会有帮助......

def get_sum_pairs(sum = None, list_of_numbers = None):
    assert sum != None and list_of_numbers != None
    list_of_numbers = sorted(list_of_numbers) # sort the list of numbers O(n log n)        
    for index, number in enumerate(list_of_numbers): # search for each number that is less than the sum O(n)
        if number < sum: # if number greater then sum, theres nothing we can do.
            for index, number_1 in enumerate(list_of_numbers[(index + 1):]): # search the second list, this isn't exactly O(n) since its incremented being incremented
                if number + number_1 == sum: # found a solution.
                    return [(number, number_1)]
                if (number_1 > sum) or (number + number_1 > sum): # if number greater then sum, theres nothing we can do.
                    break                                       # if the addition of two sorted numbers is greater then sum, then theres no need to keep searching since the rest will also be greater, since their sorted.
        else:
            break
    return [()]

唯一的一种方法是使用某种数学公式或技巧，不幸的是我没有。

在测量运行时间时，我们在很大程度上考虑了最坏的情况，在这种情况下，它将是一个数字列表，其中每个数字都小于总和并且是唯一的。

所以有 n 个唯一数字都小于总和，通过排序我们可以减少检查次数，因为我们只需要向前移动 1 + 2 == 2 + 1 :) 但我们仍然需要检查 2 + 3, 3 + 4 依此类推...但还要注意，如果检查的总和大于给出总和，我们也可以停止，因为总和将增加... :)

这里有一些测试......

assert all([get_sum_pairs(**test[0]) == test[1] for test in
     [({'list_of_numbers':[6,3,6,8,3,2,8,3,2], 'sum':11}, [(3, 8)]),
     ({'list_of_numbers':[1,2,3,4,1,2], 'sum':1}, [()]),
     ({'list_of_numbers':[1,2,3,1,23,1,23,123], 'sum':124}, [(1, 123)]),
     ({'list_of_numbers':[1,2,3,12,3,2,1,23,4,1,23,4,5,12332], 'sum':14}, [(2, 12)]),
     ({'list_of_numbers':[-1,2,-2, -3, 1, 2, 3, 2, -1.3], 'sum':1}, [(-1, 2)])
     ] ])

score 2 · Accepted Answer

我不确定@samy.vilar 的“恒定空间”，但这是一个线性时间和空间的解决方案（与n虽然成比例而不是与len(numbers)）：

def sumpairs(numbers, n):
    numbers = [None] + numbers + [None] * (n-len(numbers))
    for k in range(len(numbers)):
        a = numbers[k]
        if a is None or a==k: continue
        if numbers[n-a]==n-a:
            return a, n-a
        numbers[k] = None
        while numbers[a] != a and a is not None:
            b = n-a
            if numbers[a] is None:
                numbers[a] = a
                break
            if numbers[b]==b:
                return a, n-a
            numbers[a], a = a, numbers[a]

print(sumpairs([6,3,6,8,3,2,8,3,2], 16))
print(sumpairs([6,3,6,8,3,2,8,3,2], 11))
print(sumpairs([6,3,5,8,3,2,8,3,2], 10))
print(sumpairs([6,3,5,8,3,2,8,3,2], 5))
print(sumpairs([6,3,5,8,3,2,8,3,2], 12)) # This should fail.

它通过将每个数字移动到列表中的相应位置来工作（我添加了一个前导None以获得基于 1 的索引）。复杂性有点棘手：有一个嵌套循环，但由于每个数字的位置最多会改变一次，整个过程仍然是 O(n)。

n与数字列表的长度相比，该解决方案当然很糟糕。这是一个恒定空间的解决方案（如果您销毁输入，否则您需要它的单个副本）和 O(n log n) 时间，n 是输入的长度：

def sumpairs(numbers, n):
    numbers = sorted(numbers)
    low = 0
    while low < len(numbers)-1:
        t = numbers[low] + numbers[-1]
        if t > n:
            numbers.pop(-1)
        elif t < n:
            low += 1
        else:
            return numbers[low], numbers[-1]

print(sumpairs([6,3,6,8,3,2,8,3,2], 16))
print(sumpairs([6,3,6,8,3,2,8,3,2], 11))
print(sumpairs([6,3,5,8,3,2,8,3,2], 10))
print(sumpairs([6,3,5,8,3,2,8,3,2], 5))
print(sumpairs([6,3,5,8,3,2,8,3,2], 12)) # This should fail.

python - 一圈？Python

7 回答 7

Related

Reference