python - Shorthand for [:alpha:] in Python regular expressions

Question

What is the equivalent of [:alpha:] if I am writing a Unicode regular expression that needs it?

For example, the equivalent of [:word:] is [\w].

Any help would be great.

score 10 · Accepted Answer

For Unicode compliance, you need to use

regex = re.compile(r"[^\W\d_]", re.UNICODE)

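Unicode character properties (such as \p{L}) are not supported by the current Python regex engine.

Explanation:

With the Unicode flag set, \w matches any letter, digit, or underscore.

[^\W] matches the same thing, but because it is a negated character class, we can now subtract the characters we don't want to include:

[^\W\d_] matches anything that \w matches, except digits (\d) and the underscore (_).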

>>> import re
>>> regex = re.compile(r"[^\W\d_]", re.UNICODE)
>>> regex.findall("aä12_")
['a', 'ä']
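
As a further illustration of the same idea, here is a minimal sketch that assumes Python 3, where str patterns are Unicode-aware by default and the re.UNICODE flag is therefore redundant. The helper name find_words and the sample string are made up for this example and are not part of the original answer.

```python
import re

# Assumption: Python 3, where str patterns are Unicode-aware by default,
# so no re.UNICODE flag is passed here.
# One or more characters matched by \w, minus digits and the underscore.
LETTERS = re.compile(r"[^\W\d_]+")


def find_words(text):
    """Return the runs of letters found in text (hypothetical helper)."""
    return LETTERS.findall(text)


print(find_words("naïve café_42 Привет"))
```

The underscore and the digits act as separators here, so "café_42" contributes only "café".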

score -1 · Accepted Answer

Any character in the range:

[A-Za-z]

This is the best shorthand in Python.

Or you can use [A-Z] with the ignorecase flag: re.compile(r'[A-Z]', re.I)

Or inline: re.compile(r'(?i)[A-Z]')

For Unicode: re.compile(r'[A-Z]', re.I|re.U) or re.compile(r'(?iu)[A-Z]')
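
A quick check, assuming Python 3, suggests that the Unicode flag does not widen the [A-Z] range itself: accented letters such as ä still fall outside it. The sketch below compares it with the [^\W\d_] pattern from the other answer on the same sample string; the variable names are illustrative only.

```python
import re

sample = "aä12_"

# Case-insensitive ASCII range with the Unicode flag, as suggested above.
ascii_range = re.compile(r"[A-Z]", re.I | re.U)

# Unicode letter class from the other answer, for comparison.
unicode_letters = re.compile(r"[^\W\d_]", re.U)

ascii_matches = ascii_range.findall(sample)
unicode_matches = unicode_letters.findall(sample)

print(ascii_matches)    # ['a']
print(unicode_matches)  # ['a', 'ä']
```

So the range-based shorthand is probably only appropriate when the input is known to contain ASCII letters.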
