python - Difference in regex between Python and Rubular?

Question

In Rubular, I created a regular expression:

(Prerequisite|Recommended): (\w|-| )*

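It matches the bold text (the Recommended and Prerequisite clauses):

Recommended: good comfort level with computers and some of the arts.

Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only.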

This is how the regex is used in Python:

import re

note_re = re.compile(r'(Prerequisite|Recommended): (\w|-| )*', re.IGNORECASE)

def prereqs_of_note(note):
    match = note_re.match(note)
    if not match:
        return None
    return match.group(0)

Unfortunately, the code returns None instead of a match:

>>> import prereqs
>>> result = prereqs.prereqs_of_note("Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only.")
>>> print result
None

What am I doing wrong here?

UPDATE: Do I need re.search() instead of re.match()?

score 2 · Accepted Answer

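You want re.search(), because it scans through the whole string looking for a match. You don't want re.match(), because it only tries to apply the pattern at the beginning of the string.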

>>> import re
>>> s = """Summer. 2 credits. Prerequisite: pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only."""
>>> note_re = re.compile(r'(Prerequisite|Recommended): ([\w -]*)', re.IGNORECASE)
>>> note_re.search(s).groups()
('Prerequisite', 'pre-freshman standing or permission of instructor')
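
To make the difference concrete, here is a minimal sketch of the asker's helper rewritten around search(). The function name prereqs_of_note and the sample note come from the question, and returning group(0) is kept from the original code; treat it as one possible fix rather than the only one.

```python
import re

note_re = re.compile(r'(Prerequisite|Recommended): ([\w -]*)', re.IGNORECASE)

def prereqs_of_note(note):
    """Return the matched 'Prerequisite: ...' clause, or None if absent."""
    # search() scans the whole string; match() only tries at position 0.
    match = note_re.search(note)
    if not match:
        return None
    return match.group(0)

note = ("Summer. 2 credits. Prerequisite: pre-freshman standing or "
        "permission of instructor. Credit may not be applied toward "
        "engineering degree. S-U grades only.")

print(note_re.match(note))    # None: the string starts with "Summer", not the pattern
print(prereqs_of_note(note))  # Prerequisite: pre-freshman standing or permission of instructor
```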

Additionally, if you want the match to include the first period after the word "instructor", you will have to add a literal "." to your pattern:

>>> re.search(r'(Prerequisite|Recommended): ([\w .-]*)', s, re.IGNORECASE).groups()
('Prerequisite', 'pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only.')
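
One caution when adding characters to a class: a hyphen placed between two characters denotes a range, so a literal hyphen is safest first or last in the brackets. The sketch below is only an illustration; the sample string is made up and the two compiled names are hypothetical.

```python
import re

# Between two characters, '-' forms a range: ' -.' spans every character
# from space (0x20) through period (0x2E), which includes ! " # $ % & ' ( ) * + ,
range_class = re.compile(r'[ -.]+')

# With the hyphen last it is a literal: only space, period, and hyphen match.
literal_class = re.compile(r'[ .-]+')

sample = "a, b"
print(range_class.findall(sample))    # [', ']
print(literal_class.findall(sample))  # [' ']
```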

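I would suggest making your pattern greedier and matching the rest of the line, unless that is not really what you want, although it looks like it is.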

>>> re.search(r'(Prerequisite|Recommended): (.*)', s, re.IGNORECASE).groups()
('Prerequisite', 'pre-freshman standing or permission of instructor. Credit may not be applied toward engineering degree. S-U grades only.')

For this example, the earlier pattern with the added literal "." returns the same result as .* does.
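
The two patterns agree here only because the rest of the note happens to contain nothing but word characters, spaces, hyphens, and periods. As a rough illustration, they diverge as soon as another character such as a comma or a line break appears. The note text below is invented, not from the question, and the character-class pattern is written with the hyphen last.

```python
import re

# Hypothetical note containing a comma and a line break.
note = "Prerequisite: CS 101, or permission of instructor.\nS-U grades only."

class_re = re.compile(r'(Prerequisite|Recommended): ([\w .-]*)', re.IGNORECASE)
dot_re = re.compile(r'(Prerequisite|Recommended): (.*)', re.IGNORECASE)

# The character class stops at the first character it does not list (the comma).
print(class_re.search(note).group(2))  # CS 101

# '.' matches anything except a newline, so '.*' runs to the end of the line.
print(dot_re.search(note).group(2))    # CS 101, or permission of instructor.
```

If the text to capture can span several lines, passing re.DOTALL lets "." match newlines as well.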
