
For example, if I have a string of numbers and a list of words:

My_number = ("5,6!7,8")
My_word =["hel?llo","intro"]
4 Answers


Using str.translate (the two-argument form shown here is Python 2 only):

>>> from string import punctuation
>>> lis = ["hel?llo","intro"]
>>> [ x.translate(None, punctuation) for x in lis]
['helllo', 'intro']
>>> strs = "5,6!7,8"
>>> strs.translate(None, punctuation)
'5678'
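On Python 3, str.translate takes a single table argument, so the calls above raise a TypeError there. A rough equivalent, offered as a sketch rather than as part of the original answer, builds a deletion table with str.maketrans. The names table, cleaned_list and cleaned_str are only illustrative.

```python
from string import punctuation

# The third argument of str.maketrans lists the characters to delete.
table = str.maketrans("", "", punctuation)

lis = ["hel?llo", "intro"]
strs = "5,6!7,8"

cleaned_list = [x.translate(table) for x in lis]
cleaned_str = strs.translate(table)

print(cleaned_list)
print(cleaned_str)
```

Building the table once and reusing it avoids recomputing it for every string.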

Using regex:

>>> import re
>>> # re.escape keeps ']', '\', '^' and '-' literal inside the character class
>>> pattern = r'[{}]+'.format(re.escape(punctuation))
>>> [re.sub(pattern, '', x) for x in lis]
['helllo', 'intro']
>>> re.sub(pattern, '', strs)
'5678'

Using a list comprehension and str.join:

>>> ["".join([c for c in x if c not in punctuation])  for x in lis]
['helllo', 'intro']
>>> "".join([c for c in strs if c not in punctuation])
'5678'

As a function:

>>> from collections import Iterable
>>> def my_strip(args):
...     if isinstance(args, Iterable) and not isinstance(args, basestring):
...         return [x.translate(None, punctuation) for x in args]
...     else:
...         return args.translate(None, punctuation)
...
>>> my_strip("5,6!7,8")
'5678'
>>> my_strip(["hel?llo","intro"])
['helllo', 'intro']
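The function above depends on basestring and on collections.Iterable, neither of which is available on current Python 3 releases. A possible Python 3 adaptation, given here as a sketch and not as the answerer's code, checks for str first and imports Iterable from collections.abc. The module-level name _TABLE and the TypeError branch are additions made for illustration.

```python
from collections.abc import Iterable
from string import punctuation

# Deletion table for ASCII punctuation (illustrative helper name).
_TABLE = str.maketrans("", "", punctuation)


def my_strip(args):
    """Strip ASCII punctuation from a string, or from each string in an iterable."""
    # Test for str first: a str is itself iterable.
    if isinstance(args, str):
        return args.translate(_TABLE)
    if isinstance(args, Iterable):
        return [x.translate(_TABLE) for x in args]
    raise TypeError("expected a string or an iterable of strings")


print(my_strip("5,6!7,8"))
print(my_strip(["hel?llo", "intro"]))
```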
answered 2013-06-22T07:12:07.497

Assuming you meant my_number to be a string,

>>> from string import punctuation
>>> my_number = "5,6!7,8"
>>> my_word = ["hel?llo", "intro"]
>>> remove_punctuation = lambda s: s.translate(None, punctuation)
>>> my_number = remove_punctuation(my_number)
>>> my_word = map(remove_punctuation, my_word)
>>> my_number
'5678'
>>> my_word
['helllo', 'intro']
answered 2013-06-22T07:11:03.360

Using filter + str.isalnum:

>>> filter(str.isalnum, '5,6!7,8')
'5678'
>>> filter(str.isalnum, 'hel?llo')
'helllo'
>>> [filter(str.isalnum, word) for word in ["hel?llo","intro"]]
['helllo', 'intro']

This only works in Python 2. In Python 3, filter always returns an iterator, so you have to write ''.join(filter(str.isalnum, the_text)) instead.
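As a runnable Python 3 sketch of that form (the helper name keep_alnum is hypothetical), note one side effect: str.isalnum rejects every non-alphanumeric character, so whitespace is removed along with the punctuation.

```python
def keep_alnum(text):
    # filter() is lazy on Python 3, so join the characters it keeps.
    return "".join(filter(str.isalnum, text))


number = keep_alnum("5,6!7,8")
words = [keep_alnum(w) for w in ["hel?llo", "intro"]]
spaced = keep_alnum("a b, c")  # the spaces are dropped too

print(number)
print(words)
print(spaced)
```

If spaces need to survive, one of the punctuation-based approaches is probably a better fit.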

answered 2013-06-22T07:21:51.137

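Here is a Unicode-aware solution. Po is the Unicode general category "Punctuation, other", which covers characters such as ?, the comma and !.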

>>> import unicodedata
>>> mystr = "1?2,3!abc"
>>> mystr = "".join([x for x in mystr if unicodedata.category(x) != "Po"])
>>> mystr
'123abc'
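The snippet above filters on Po only, so hyphens (Pd) and brackets (Ps, Pe) pass through. A slightly broader sketch, assuming you want to drop every Unicode punctuation category (they all begin with "P"), could look like this on Python 3. The function name strip_unicode_punctuation is made up for illustration.

```python
import unicodedata


def strip_unicode_punctuation(text):
    # Punctuation categories are Pc, Pd, Ps, Pe, Pi, Pf and Po.
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


basic = strip_unicode_punctuation("1?2,3!abc")
bracketed = strip_unicode_punctuation("a-b(c)")
symbols = strip_unicode_punctuation("$5+5")  # '$' and '+' are symbols, not punctuation

print(basic)
print(bracketed)
print(symbols)
```

Unlike string.punctuation, this leaves characters such as $ and + alone, because Unicode classifies them as symbols (Sc, Sm) rather than punctuation.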

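You can also do this with regular expressions, using the re module and re.sub. Unfortunately, the standard library's re module does not support Unicode categories, so you have to list every character you want to remove by hand. A separate library called regex does offer that feature, but it is not part of the standard library.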

answered 2013-06-22T07:26:44.987