python - 在不使用附加缓冲区的情况下从 Python 中的字符串中删除重复项

Question

我想在 Python 中解决这个问题：

given a string (without spacing), remove the duplicates without using an adittional buffer.

我有以下代码：

 def removedup(st):
 temp = []
 for i in range(len(st)):
         if st[i] not in temp:
                 temp.append(st[i])
 return temp

它返回一个没有重复的列表。

1-这个代码在 O(n^2) 对吗？

2-如何在不使用 python 中的额外缓冲区的情况下做同样的事情？（我的意思是不使用列表）。也许我可以使用字符串（不是列表），但不确定这是否会增加复杂性。另外，python 中的字符串是不可变的，所以我不能做某种类型的索引来改变一些东西。（就像在 C++ 或 Java 中一样）。

在 Python 中解决此问题的最佳方法是什么？我知道这里有一些“看起来”重复的问题，但我的问题与 Python 更相关（在没有额外缓冲区的情况下解决这个问题）。

谢谢！

score 5 · Accepted Answer

1) 是的。

2) 好

return set(st)

..是迄今为止唯一化字符串（或任何可迭代）的最简单方法。我不知道您是否认为这是“附加缓冲区”。无论如何，您都需要为另一个对象分配一些额外的内存，因为正如您所说，字符串是不可变的。

这当然不会保持秩序，如果这是一个问题，那么总是很明显：

from collections import OrderedDict

return ''.join(OrderedDict.fromkeys(st))

score 1 · Accepted Answer

0) 显然你必须使用至少一个额外的缓冲区，因为正如你所提到的，python 字符串是不可变的，你至少需要以某种方式返回结果，对吧？因此在内部至少已经使用了一个缓冲区（即使您使用相同的名称命名它）。

当然，您可以使用字符串作为缓冲区，它们可以使用字符串 + 字符串或字符串 += 字符串，甚至可以使用字符串 [:n-1] + 字符串 [n:]，但由于不可变性，它在内部会创建新对象时间。

您可以使用其他一些可变的、可迭代的而不是字符串，所以它会起作用。

1) 不，您的代码不是 O(N**2)。在最坏的情况下（所有符号都是唯一的）是 O(N*log(N))，在最好的情况下是 O(N)（所有符号只是一个符号）。

2）假设您使用列表而不是字符串，您可以执行以下操作：

def dup_remove(lst):
    i = 0
    n = len(lst)
    while i < n:
        if lst[i] in lst[:i]:
            del lst[i]
            n -= 1
        else:
            i += 1
        return lst

在最坏的情况下它仍然是 O(N*Log(N)) ，但它不使用任何额外的缓冲区，这是你首先想要的。我认为出于实际目的，使用 OrderedDict 的解决方案应该是更优化的。

score 0 · Accepted Answer

另一种通过列表切片循环实现的方法。

# O(n ^ 2)
for item in input_list[:]:
    index = input_list.index(item) + 1
    while index < len(input_list):
        if input_list[index] == item:
            del input_list[index]
        index += 1

由于 slice 创建了一个副本，如果你真的想要一个没有任何内部缓冲区的解决方案，这将是可行的。

# O(n ^ 2)
i = 0
while i < len(input_list):
    j = i + 1
    while j < len(input_list):
        if input_list[j] == input_list[i]:
            del input_list[j]
            # Don't increment j here since the next item
            # after the deleted one will move to index j
        else:
            j += 1
    i += 1

score 0 · Accepted Answer

1) 我不确定。

2）下面编码了一种非常有效的方法。请注意，我不使用任何额外的包。我什至不使用列表，只是一个字符串！

def removeDuplicate (input):   

i = 0
while i < len(input)-1:
    j = i + 1
    while j < len(input):
        if input[j] == input[i]:                
            input_list = input_list[0:j] + input_list[j+1:]                
            # Don't increment j here since the next item
            # after the deleted one will move to index j
        else:
            j += 1
    i += 1

return input

python - 在不使用附加缓冲区的情况下从 Python 中的字符串中删除重复项

4 回答 4

Related

Reference