I have a comma-separated string. How can I remove the duplicate entries from it in a Pythonic way?
For example, the string "a,a,b" should become "a,b".
Does the order of the elements matter? If not, the simplest approach is to build a set:
result = ','.join(set(text.split(',')))
But as I said, this does not preserve the order of the original string (a set's iteration order is arbitrary):
>>> text = 'b,a,b'
>>> ','.join(set(text.split(',')))
'a,b'
If order matters, you can use an OrderedDict:
>>> from collections import OrderedDict
>>> s = "a,a,b"
>>> ",".join(OrderedDict.fromkeys(s.split(',')))
'a,b'
Note that this also handles duplicates that are not adjacent:
>>> s = "a,b,a,a,a,b"
>>> ",".join(OrderedDict.fromkeys(s.split(',')))
'a,b'
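On Python 3.7 and later, a plain dict also keeps insertion order, so the same idea works without importing OrderedDict. The sketch below assumes that Python version; the helper name dedupe_keep_order and its sep parameter are made up for illustration and are not part of the answer above.

```python
def dedupe_keep_order(text, sep=","):
    """Drop repeated entries, keeping the first occurrence of each."""
    # dict keys are unique and (Python 3.7+) remember insertion order
    return sep.join(dict.fromkeys(text.split(sep)))


print(dedupe_keep_order("a,b,a,a,a,b"))  # a,b
print(dedupe_keep_order("b,a,b"))        # b,a
```

On older interpreters, OrderedDict.fromkeys remains the safe choice.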
This should do the trick:
list(set(['a','a','b']))
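That expression yields a list in no guaranteed order, so it still has to be split from, and joined back into, a string. A minimal sketch of the full round trip follows, assuming alphabetical output is acceptable. Sorting is an addition made here to get a reproducible result, and the name dedupe_sorted is hypothetical.

```python
def dedupe_sorted(text, sep=","):
    """Remove duplicates; the result is sorted, NOT in original order."""
    # set() drops duplicates, sorted() makes the output deterministic
    return sep.join(sorted(set(text.split(sep))))


print(dedupe_sorted("b,a,b"))  # a,b
```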
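Actually, you haven't fully specified what you want. As everyone has pointed out, does the order matter? And do you want to remove all duplicates, or only collapse runs of the same string?
If order doesn't matter, all of the set solutions are fine. If it does, there are itertools recipes for both cases: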
# Python 3 names; on Python 2 these were ifilterfalse and imap
from itertools import filterfalse, groupby
from operator import itemgetter

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        # filterfalse skips every element already recorded in seen
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    # groupby yields (key, group) pairs; take the first element of each group
    return map(next, map(itemgetter(1), groupby(iterable, key)))
You can apply any of these to 'a,a,b'.split(','):
In [6]: ','.join(set('a,a,b'.split(',')))
Out[6]: 'a,b'
In [7]: ','.join(unique_justseen('a,a,b'.split(',')))
Out[7]: 'a,b'
In [8]: ','.join(unique_everseen('a,a,b'.split(',')))
Out[8]: 'a,b'
Or, for a case where they differ:
In [9]: ','.join(set('a,a,b,a'.split(',')))
Out[9]: 'a,b'
In [10]: ','.join(unique_everseen('a,a,b,a'.split(',')))
Out[10]: 'a,b'
In [11]: ','.join(unique_justseen('a,a,b,a'.split(',')))
Out[11]: 'a,b,a'
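To make the difference between the two behaviors easier to see, here is a stripped-down sketch without the key argument. The names drop_all_repeats and collapse_runs are invented for this example; they mirror what unique_everseen and unique_justseen do, but they are not the itertools recipes themselves.

```python
from itertools import groupby


def drop_all_repeats(items):
    """Keep only the first occurrence of each item, wherever repeats appear."""
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def collapse_runs(items):
    """Collapse each run of consecutive equal items into a single item."""
    # groupby starts a new group whenever the value changes
    return [value for value, _run in groupby(items)]


parts = "a,a,b,a".split(",")
print(",".join(drop_all_repeats(parts)))  # a,b
print(",".join(collapse_runs(parts)))     # a,b,a
```

The trailing "a" survives in the second call because it is not adjacent to the earlier run of "a" entries.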
Hey, just use this Java 8 syntax:
String words = "hello,hii,hii,bye,hii,word,World";
words = Arrays.stream(words.split(",")).distinct().collect(Collectors.joining(","));
Output:
words: hello,hii,bye,word,World
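Since the question asks about Python, here is a rough Python counterpart of the Java snippet for comparison. It assumes Python 3.7+ so that dict preserves insertion order, matching the way distinct() keeps the first occurrence of each element in an ordered stream. The variable name deduped is introduced only for this sketch.

```python
words = "hello,hii,hii,bye,hii,word,World"

# dict.fromkeys keeps the first occurrence of each entry, in order
deduped = ",".join(dict.fromkeys(words.split(",")))

print("words:", deduped)  # words: hello,hii,bye,word,World
```

As in the Java version, the comparison is case-sensitive, so "word" and "World" are both kept.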