python - Python regex: to escape or not

Question

I need to write a regex that keeps only the characters in the list below (that is, removes every character that is not in the list).

allow_characters = "#.-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

I don't know how to go about it. Should I even be using re.match, re.findall, or re.sub ...?

Thanks a lot in advance.

score 7 · Accepted Answer

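Don't use a regex at all. First convert allow_characters to a set, then use ''.join() with a generator expression to strip the unwanted characters. Assuming the string you want to clean is called s: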

allow_char_set = set(allow_characters)
s = ''.join(c for c in s if c in allow_char_set)

That said, here is what the regex would look like:

import re

s = re.sub(r'[^#.\-_a-zA-Z0-9]+', '', s)

You could convert your allow_characters string into this regex, but I think the first solution is much more straightforward.
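If you do want to build the regex from allow_characters rather than typing it by hand, here is a minimal sketch. It passes each character through re.escape() so that '-', ']', '^' and '\' cannot be misread inside the character class. The helper names build_strip_pattern and strip_disallowed are made up for illustration, and the sketch assumes the allowed string is non-empty.

```python
import re

allow_characters = "#.-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def build_strip_pattern(allowed):
    # Hypothetical helper: escape each character individually, then wrap the
    # result in a negated character class that matches runs of anything else.
    escaped = ''.join(re.escape(c) for c in allowed)
    return re.compile('[^' + escaped + ']+')

strip_pattern = build_strip_pattern(allow_characters)

def strip_disallowed(s):
    # Hypothetical helper: delete every run of characters that is not allowed.
    return strip_pattern.sub('', s)

print(strip_disallowed("hello, world! #tag-1_a.b"))  # helloworld#tag-1_a.b
```

Escaping every character is more than strictly needed (only a few are special inside [...]), but it means you never have to think about which ones are.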

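Edit: As DSM pointed out in the comments, str.translate() is often a very good way to do this kind of thing. In this case it is slightly more involved, but you can still use it like this: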

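import string

allow_characters = "#.-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
# Python 2 only: string.maketrans('', '') returns the identity table of all 256 byte values
all_characters = string.maketrans('', '')
# Everything in that table that is not an allowed character
delete_characters = all_characters.translate(None, allow_characters)

# Delete those characters from s
s = s.translate(None, delete_characters)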
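Note that string.maketrans() and the two-argument form of str.translate() used above exist only in Python 2. On Python 3, str.translate() takes a single table that maps code points to replacements, where None means "delete". A rough equivalent, as a sketch: a dict subclass whose __missing__ returns None for anything outside the allowed set. The class name KeepOnly is made up for illustration.

```python
allow_characters = "#.-_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

class KeepOnly(dict):
    # Hypothetical helper: a translation table for str.translate() that
    # keeps the allowed characters and deletes everything else.
    def __init__(self, allowed):
        super().__init__((ord(c), c) for c in allowed)

    def __missing__(self, codepoint):
        # None tells str.translate() to drop this character.
        return None

keep_allowed = KeepOnly(allow_characters)

s = "hello, wörld! #1"
s = s.translate(keep_allowed)
print(s)  # hellowrld#1
```

Unlike the Python 2 version, this works on any Unicode input, not just the 256 byte values.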
