
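I'm trying to write a function that takes a string (a sentence), cleans it, and returns only the letters, digits, and hyphens. But the code seems to be going wrong. Please let me know what I'm doing wrong here.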

Example: Blake D'souza is an !d!0t
Should return: Blake D'souza is an d0t

Python:

def remove_unw2anted(str):
    str = ''.join([c for c in str if c in 'ABCDEFGHIJKLNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890\''])
    return str

def clean_sentence(s):
    lst = [word for word in s.split()]
    #print lst
    for items in lst:
        cleaned = remove_unw2anted(items)
    return cleaned

s = 'Blake D\'souza is an !d!0t'
print clean_sentence(s)

2 Answers


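You are only returning the last cleaned word! Each pass through the loop overwrites cleaned, so only the result for the final word survives.

It should be: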

def clean_sentence(s):
    lst = [word for word in s.split()]

    lst_cleaned = []
    for items in lst:
        lst_cleaned.append(remove_unw2anted(items))
    return ' '.join(lst_cleaned)
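
To make the difference concrete, here is a minimal sketch written for Python 3 (an assumption; the code in the question uses Python 2 print statements). The names remove_unwanted, clean_sentence_buggy, and ALLOWED are hypothetical, chosen only for illustration. The whitelist is built from string.ascii_letters rather than typed by hand, because a hand-typed alphabet can easily lose a letter; the uppercase run in the question is missing 'M', for example.

```python
import string

# Assumed whitelist: ASCII letters, digits, and the apostrophe, as in the question.
ALLOWED = set(string.ascii_letters + string.digits + "'")

def remove_unwanted(word):
    # Keep only the characters that appear in the whitelist.
    return "".join(c for c in word if c in ALLOWED)

def clean_sentence_buggy(s):
    cleaned = ""
    for word in s.split():
        cleaned = remove_unwanted(word)  # overwritten on every iteration
    return cleaned                       # only the last word is left

def clean_sentence(s):
    # Clean every word, then join the results back into one sentence.
    return " ".join(remove_unwanted(word) for word in s.split())

s = "Blake D'souza is an !d!0t"
print(clean_sentence_buggy(s))  # d0t
print(clean_sentence(s))        # Blake D'souza is an d0t
```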

A shorter approach could be something like this:

def is_ok(c):
    return c.isalnum() or c in " '"

def clean_sentence(s):
    return filter(is_ok, s)

s = "Blake D'souza is an !d!0t"
print clean_sentence(s)
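
Note that the version above relies on Python 2, where filter applied to a str returns a str. On Python 3, filter returns a lazy iterator, so the result has to be joined back into a string. A minimal sketch, assuming Python 3:

```python
def is_ok(c):
    # Letters and digits pass, plus the space and the apostrophe.
    return c.isalnum() or c in " '"

def clean_sentence(s):
    # filter() is lazy on Python 3, so join the surviving characters.
    return "".join(filter(is_ok, s))

s = "Blake D'souza is an !d!0t"
print(clean_sentence(s))  # Blake D'souza is an d0t
```

Keep in mind that on Python 3, isalnum also accepts non-ASCII letters and digits, so this is slightly more permissive than an explicit ASCII whitelist.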
answered 2013-02-02T15:35:21.437

A variation using string.translate, which has the benefit of being easy to extend and is part of the string module.

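import string  # Python 2 only: string.maketrans and string.letters are not available in Python 3

# Identity table: a 256-character string containing every byte value
allchars = string.maketrans('', '')

# Characters to keep: letters, digits, and the hyphen
tokeep = string.letters + string.digits + '-'

# Everything that is not in tokeep
toremove = allchars.translate(None, tokeep)

s = "Blake D'souza is an !d!0t"

# Delete every character listed in toremove
print s.translate(None, toremove)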

Output:

BlakeDsouzaisand0t

The OP said to keep only letters, digits, and hyphens; perhaps they also meant to keep spaces?
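
For Python 3, where the two-argument form of translate and string.letters no longer exist, the same idea can be sketched with a deletion table built from the input itself. This is only an illustration under that assumption; keep_only and KEEP are hypothetical names. Extending the whitelist (for example with the space and the apostrophe) is a one-line change:

```python
import string

# Assumed whitelist from this answer: ASCII letters, digits, and the hyphen.
KEEP = set(string.ascii_letters + string.digits + "-")

def keep_only(text, keep=KEEP):
    # Map every unwanted character that occurs in text to None (delete it).
    table = {ord(ch): None for ch in set(text) if ch not in keep}
    return text.translate(table)

s = "Blake D'souza is an !d!0t"
print(keep_only(s))                    # BlakeDsouzaisand0t
print(keep_only(s, KEEP | set(" '")))  # Blake D'souza is an d0t
```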

answered 2013-02-02T19:14:09.493