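I am trying to strip substrings from every element in a list of strings. I can't figure out how to handle strings that contain more than one of the substrings (stop words) to be removed.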

wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
stop_words = ("2005", "2008", "2009", "Cotes du Rhone")
result = []

for wine in wines:
    for stop in stop_words:
        if stop in wine:
            x = wine.replace(stop, "")
            result.append(x)

print result

Changing the if statement to a for or a while either returns garbage or hangs. Any suggestions?

3 Answers

A small indentation change and one extra variable solve your problem:

for wine in wines:
    glass = wine  # Let's pour your wine into a glass
    for stop in stop_words:
        if stop in glass:  # Is stop in your glass?
            # Replace stop in the glass and pour the result back into the glass
            glass = glass.replace(stop, "")
    result.append(glass)  # Finally, pour the contents of your glass into result


result
[' Chardonnay', 'Cabernet Sauvignon ', 'Bordeaux  ']
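
To make the difference concrete, here is a small side-by-side sketch of the question's loop and the corrected loop. The names `buggy` and `fixed` are made up for this illustration. The question's loop appends once per matching stop word and always starts again from the untouched `wine`, so a wine with two stop words yields two partially cleaned entries.

```python
wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
stop_words = ("2005", "2008", "2009", "Cotes du Rhone")

# The question's loop: one append per matching stop word,
# each time replacing in the original, unmodified string.
buggy = []
for wine in wines:
    for stop in stop_words:
        if stop in wine:
            buggy.append(wine.replace(stop, ""))

# Corrected loop: accumulate the replacements in one variable
# and append exactly once per wine.
fixed = []
for wine in wines:
    glass = wine
    for stop in stop_words:
        glass = glass.replace(stop, "")
    fixed.append(glass)

print(buggy)  # four entries; the last wine appears twice, half cleaned each time
print(fixed)  # three entries, one per wine
```

The `if stop in glass` check is left out of the corrected loop here because `str.replace` simply returns the string unchanged when there is nothing to replace.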

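If you feel adventurous, you can use a regular expression. I believe a regex may be faster than the simple loop in this case, though I have not measured it: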

>>> import re
>>> result = []
>>> for wine in wines:
...     result.append(re.sub('('+'|'.join(stop_words)+')','',wine))
...

>>> result
[' Chardonnay', 'Cabernet Sauvignon ', 'Bordeaux  ']
>>> 

Or as a list comprehension:

>>> [re.sub('('+'|'.join(stop_words)+')','',wine) for wine in wines]
[' Chardonnay', 'Cabernet Sauvignon ', 'Bordeaux  ']
>>> 
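
A slightly more defensive variant of the regex approach, offered as a sketch rather than a drop-in answer: it escapes each stop word with `re.escape` (in case one contains characters such as `.` or `+`), tries longer stop words first, compiles the pattern once, and collapses the leftover whitespace. The helper name `remove_stop_words` is hypothetical, and the whitespace cleanup goes beyond what the question asked for.

```python
import re

wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
stop_words = ("2005", "2008", "2009", "Cotes du Rhone")

# Escape every stop word so it is matched literally, and put longer
# alternatives first so a short stop word cannot pre-empt a longer one.
pattern = re.compile(
    "|".join(re.escape(stop) for stop in sorted(stop_words, key=len, reverse=True))
)

def remove_stop_words(text):
    """Hypothetical helper: drop all stop words, then normalise whitespace."""
    without_stops = pattern.sub("", text)
    # split() with no arguments discards leading, trailing and repeated spaces.
    return " ".join(without_stops.split())

cleaned = [remove_stop_words(wine) for wine in wines]
print(cleaned)
```

Note that this still matches substrings, not whole words: a stop word of "2005" would also be cut out of "20050". Wrapping the alternation in `\b` word boundaries is one way to tighten that, if the data calls for it.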
answered 2012-04-10T16:15:34.217
wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
stop_words = ("2005", "2008", "2009", "Cotes du Rhone")
result = []

for wine in wines:
    x = wine
    for stop in stop_words:        
        x = x.replace(stop, "")
    result.append(x)

print(result)

IMO, using a regex would be much better:

>>> wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
>>> stop_words = ("2005", "2008", "2009", "Cotes du Rhone")
>>> import re
>>> [re.sub('|'.join(stop_words),'',wine) for wine in wines]
[' Chardonnay', 'Cabernet Sauvignon ', 'Bordeaux  ']
answered 2012-04-10T16:16:16.103

As a one-liner, taking into account jamylak's suggestion to use strip():

[reduce(lambda x,y: x.replace(y, "").strip(), stop_words, wine) for wine in wines]

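Note that this works as written in Python 2.x but not in Python 3, where reduce() is no longer a built-in and has been moved to the functools module. If you are using Python 3, do this instead: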

import functools as ft
[ft.reduce(lambda x,y: x.replace(y, "").strip(), stop_words, wine) for wine in wines]
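
If the same code has to run on both versions, one possible sketch is to import `reduce` from `functools` explicitly, which works on Python 2.6+ as well as Python 3. The function name `strip_stop_words` is invented for this example.

```python
from functools import reduce  # importable on Python 2.6+ and on Python 3

wines = ("2008 Chardonnay", "Cabernet Sauvignon 2009", "Bordeaux 2005 Cotes du Rhone")
stop_words = ("2005", "2008", "2009", "Cotes du Rhone")

def strip_stop_words(text, stops):
    """Hypothetical helper: fold over the stop words, removing each in turn."""
    # reduce starts from `text` and feeds the running result through the
    # lambda once per stop word; strip() trims the ends after every step.
    return reduce(lambda acc, stop: acc.replace(stop, "").strip(), stops, text)

cleaned = [strip_stop_words(wine, stop_words) for wine in wines]
print(cleaned)
```

Keep in mind that strip() only trims the ends of the string, so a stop word removed from the middle of a name would still leave a double space behind.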
answered 2012-04-10T16:30:32.573