input1 = input("Corrupted: ")
input2 = ""
final = ""
for i in input1:
  if i in "ATGC ":
    input2 = input2 + i
for i in set(input2.split()):
  final = final + i + " "
print("DNA:",final.rstrip())

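The purpose of this program is to let the user enter a string of text that has a DNA code hidden inside it. The program extracts the DNA code by dropping every character that is not A, T, G, C or a space. It also removes duplicate entries. Everything it does is correct, except that it prints the results in the wrong order. I would ask my tutor for help, but he is currently unavailable.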

Corrupted: A1TGcC A?T-G %^AT@CT ATGc #Notice the double ATG (2nd and last one)
DNA: ATGC ATCT ATG #Only one ATG since one is removed.

When it is meant to output:

Corrupted: A1TGcC A?T-G %^AT@CT ATGc #This one is in the correct order. How do I get it to stay in the same order?
DNA: ATGC ATG ATCT

1 Answer

Sets don't have any order:

>>> print(set.__doc__)
...
Build an unordered collection of unique elements.

To preserve the order, you can do something like this:

>>> lis = [1, 2, 1, 1, 5, 5, 6]
>>> seen = set()
>>> [item for item in lis if item not in seen and not seen.add(item)]
[1, 2, 5, 6]
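
As an alternative sketch of the same idea, and assuming Python 3.7 or later (where a plain `dict` keeps insertion order), `dict.fromkeys` can do the de-duplication without the `seen.add` trick. The helper name `dedupe_keep_order` is made up for this illustration:

```python
def dedupe_keep_order(items):
    """Return the unique elements of `items` in first-seen order."""
    # dict keys are unique, and from Python 3.7 on they keep insertion order,
    # so the first occurrence of each item fixes its position.
    return list(dict.fromkeys(items))


print(dedupe_keep_order([1, 2, 1, 1, 5, 5, 6]))  # [1, 2, 5, 6]
```

Like the set-based version, this only works when the items are hashable, which strings and numbers are.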

For your code, you can use a regex instead of string concatenation, because something like input2 = input2 + i is very slow for large strings:

>>> import re
>>> corrupted = 'A1TGcC A?T-G %^AT@CT ATGc'
>>> lis = re.sub(r'[^ATGC\s]', '', corrupted).split()
>>> lis
['ATGC', 'ATG', 'ATCT', 'ATG']
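
Putting the two pieces together, a minimal sketch of the whole program could look like the following. The function name `extract_dna` is hypothetical and not part of the original code. The loop is the order-preserving de-duplication from above, written out step by step:

```python
import re


def extract_dna(corrupted):
    """Strip non-DNA characters and drop repeated fragments, keeping order."""
    # Remove every character that is not A, T, G, C or whitespace.
    cleaned = re.sub(r'[^ATGC\s]', '', corrupted)

    seen = set()   # fragments already emitted
    unique = []    # fragments in first-seen order
    for fragment in cleaned.split():
        if fragment not in seen:
            seen.add(fragment)
            unique.append(fragment)
    return ' '.join(unique)


print("DNA:", extract_dna('A1TGcC A?T-G %^AT@CT ATGc'))  # DNA: ATGC ATG ATCT
```

In the original program you would pass `input("Corrupted: ")` to the function instead of the hard-coded string.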
answered 2013-09-07T07:21:17.693