0

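I have a list of text lines, textlines, which is a list of strings (each ending with '\n').

I want to remove lines that occur more than once, but leave alone any line that contains only spaces, newlines, and tabs.

In other words, if the original list is: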

textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = "   \n"
textlines[3] = "First line\n"
textlines[4] = "   \n"

the output list would be:

textlines[0] = "First line\n"
textlines[1] = "Second line \n"
textlines[2] = "   \n"
textlines[3] = "   \n"

How can I do this?

3 Answers

3
seen = set()
res = []
for line in textlines:
    if line not in seen:
        res.append(line)
        if line.strip():  # remember only non-blank lines, so blank ones are never filtered
            seen.add(line)
textlines = res
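
A minimal sketch of the set-based approach as a reusable function, run on the example from the question. The function name remove_duplicate_lines is hypothetical, and it assumes "blank" means that line.strip() is empty.

```python
def remove_duplicate_lines(textlines):
    """Drop repeated lines, but keep every whitespace-only line."""
    seen = set()
    res = []
    for line in textlines:
        if line not in seen:
            res.append(line)
            # Only non-blank lines are remembered, so blank lines
            # never match `seen` and are always kept.
            if line.strip():
                seen.add(line)
    return res


textlines = ["First line\n", "Second line \n", "   \n", "First line\n", "   \n"]
print(remove_duplicate_lines(textlines))
# ['First line\n', 'Second line \n', '   \n', '   \n']
```

Returning a new list instead of rebinding textlines leaves the caller's list unchanged, and the set keeps each membership check at roughly constant time.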
answered 2013-12-08T21:09:44.180
1

Because I can't resist a good round of code golf:

seen = set()

[x for x in textlines if (x not in seen or not x.strip()) and not seen.add(x)]
Out[29]: ['First line\n', 'Second line \n', '   \n', '   \n']

This is equivalent to @hughbothwell's answer, which is the one you should use if you intend humans to read your code :-)
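
For anyone puzzled by the comprehension: set.add always returns None, so `not seen.add(x)` is always true and is there only for its side effect. Because `and` short-circuits, x is recorded only after the first condition has accepted it. A small sketch to illustrate, using made-up sample values:

```python
seen = set()

# set.add mutates the set and returns None, so negating it gives True.
result = seen.add("a")
print(result is None)       # True
print(not seen.add("b"))    # True, and "b" is now in the set

# `and` short-circuits: the right-hand side never runs when the left
# side is False, so "c" is not added here.
accepted = False and not seen.add("c")
print(sorted(seen))         # ['a', 'b']
```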

answered 2013-12-08T21:34:53.860
0
new = []
for line in textlines:
    if line in new and line.strip():
        continue
    new.append(line)
textlines = new
answered 2013-12-08T21:19:26.740