5

I have a comma-separated string. How can I remove the duplicate entries from it in a Pythonic way?

For example, the string "a,a,b" should become "a,b".

4

5 Answers

13

Does the order of the elements matter? If not, the simplest way is to create a set:

result = ','.join(set(text.split(',')))

But as I said, this does not preserve the order of the original string:

>>> text = 'b,a,b'
>>> ','.join(set(text.split(',')))
'a,b'
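
If order does not matter but a reproducible result does, one option is to sort the set before joining, since the iteration order of a set is not guaranteed. A minimal sketch, assuming the entries should be ordered as plain strings; the helper name dedupe_sorted is made up for illustration:

```python
def dedupe_sorted(text):
    """Drop duplicate comma-separated entries and return them in sorted order."""
    # set() removes the duplicates; sorted() makes the output deterministic.
    return ','.join(sorted(set(text.split(','))))

print(dedupe_sorted('b,a,b'))  # a,b
```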
answered 2013-09-04T11:08:20.780
6

If order matters, you can use OrderedDict:

>>> from collections import OrderedDict
>>> s = "a,a,b"
>>> ",".join(OrderedDict.fromkeys(s.split(',')))
'a,b'

Note that this also handles duplicates that are not adjacent:

>>> s = "a,b,a,a,a,b"
>>> ",".join(OrderedDict.fromkeys(s.split(',')))
'a,b'
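
On Python 3.7 and later, the built-in dict also preserves insertion order, so the same idea should work without the import. A short sketch under that assumption; the helper name dedupe_ordered is only illustrative:

```python
def dedupe_ordered(text):
    """Drop duplicate comma-separated entries, keeping first occurrences in order."""
    # dict keys are unique and, on Python 3.7+, keep their insertion order.
    return ','.join(dict.fromkeys(text.split(',')))

print(dedupe_ordered('b,a,b,a'))  # b,a
```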
answered 2013-09-04T11:11:02.743
0

This should do the trick:

list(set(['a','a','b']))
answered 2013-09-04T11:10:52.877
0

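Actually, you haven't specified what you want well enough. As everyone has pointed out, does the order matter? And do you want to remove all duplicates, or only consecutive runs of the same string?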

If order doesn't matter, any of the set solutions will do. If it does, there are itertools recipes for both cases:

# Written for Python 3 (on Python 2, use ifilterfalse and imap from itertools instead).
from itertools import filterfalse, groupby
from operator import itemgetter

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return map(next, map(itemgetter(1), groupby(iterable, key)))

You can apply any of these to 'a,a,b'.split(','):

In [6]: ','.join(set('a,a,b'.split(',')))
Out[6]: 'a,b'

In [7]: ','.join(unique_justseen('a,a,b'.split(',')))
Out[7]: 'a,b'

In [8]: ','.join(unique_everseen('a,a,b'.split(',')))
Out[8]: 'a,b'

Or, for a case where they differ:

In [9]: ','.join(set('a,a,b,a'.split(',')))
Out[9]: 'a,b'

In [10]: ','.join(unique_everseen('a,a,b,a'.split(',')))
Out[10]: 'a,b'

In [11]: ','.join(unique_justseen('a,a,b,a'.split(',')))
Out[11]: 'a,b,a'
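
The difference between the two behaviours can also be sketched without the recipes. This is a minimal Python 3 illustration, not the itertools code itself; the helper names dedupe_all and dedupe_runs are hypothetical:

```python
from itertools import groupby

def dedupe_all(text):
    """Remove every repeated entry, keeping the first occurrence of each."""
    seen = set()
    result = []
    for item in text.split(','):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return ','.join(result)

def dedupe_runs(text):
    """Collapse only consecutive runs of the same entry."""
    # groupby yields one (key, group) pair per run of equal items.
    return ','.join(key for key, _group in groupby(text.split(',')))

print(dedupe_all('a,a,b,a'))   # a,b
print(dedupe_runs('a,a,b,a'))  # a,b,a
```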
answered 2013-09-04T11:20:06.667
0

Hey, just use this Java 8 syntax:

import java.util.Arrays;
import java.util.stream.Collectors;
String words = "hello,hii,hii,bye,hii,word,World";
words = Arrays.stream(words.split(",")).distinct().collect(Collectors.joining(","));

Output:

words: hello,hii,bye,word,World
answered 2020-02-18T12:36:51.157