python - Referencing a group within a group using regular expressions

Question

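I am trying to find a regular expression that matches a word ending in two identical symbols followed by "er" and splits it between those two symbols. Example: the word "Letter" should be split into "Let" and "ter". I am using Python, and this is what I have so far: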

import re

word = 'Letter'
match = re.search(r'(\w*)((\w)\1(er$))', word)
print(match.group(1))  # should print 'Let'
print(match.group(2))  # should print 'ter'

The problem is that (\w)\1 does not refer to the right group, because it is a group within a group. How can this be solved?

Thanks in advance.

score 7 · Accepted Answer

I am using named groups because they are easier to reference:

import re
pattern = r"""
          \b(?P<first_part>\w*(?P<splitter>\w))   # matches starting at a word boundary
          (?P<last_part>(?P=splitter)er\b)        # matches the last letter of the first group
                                                  # plus 'er' if followed by a word boundary
          """
matcher = re.compile(pattern, re.X)
print(matcher.search('letter').groupdict())
# out: {'first_part': 'let', 'splitter': 't', 'last_part': 'ter'}
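The same idea can be sketched with numbered groups. Groups are numbered by the position of their opening parenthesis, so in the question's pattern `\1` points at `(\w*)` rather than at the inner `(\w)`. The restructured pattern and the helper name `split_double` below are illustrative assumptions, not part of the original answer.

```python
import re

# What the question's pattern actually does: \1 repeats (\w*), which can only
# succeed here by matching the empty string.
print(re.search(r'(\w*)((\w)\1(er$))', 'Letter').groups())
# ('', 'ter', 't', 'er')

# Restructured so the backreference targets the nested group:
#   group 1: (\w*(\w))  everything up to and including the first doubled letter
#   group 2: (\w)       the doubled letter itself (nested inside group 1)
#   group 3: (\2er$)    the second copy of that letter plus 'er'
PATTERN = re.compile(r'(\w*(\w))(\2er$)')

def split_double(word):
    """Return (head, tail) if word ends in a doubled letter plus 'er', else None."""
    match = PATTERN.search(word)
    if match is None:
        return None
    return match.group(1), match.group(3)

print(split_double('Letter'))  # ('Let', 'ter')
print(split_double('butter'))  # ('but', 'ter')
print(split_double('water'))   # None
```

Named groups, as in the answer above, avoid having to count parentheses by hand when the pattern changes.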

score 1 · Accepted Answer

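I want the first group to be everything up to and including the first of the two identical symbols, and the second group to be the second identical symbol followed by "er".

That would be: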

match = re.search(r'(\w*(\w)(?=\2))(\w*er$)', 'Letter')

print(match.groups())
# -> ('Let', 't', 'ter')
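A short sketch of why the lookahead version works: `(?=\2)` checks that the next character repeats group 2 without consuming it, so the first group ends exactly where the last group begins. The test word "bookkeeper" is only an illustration of a side effect of the trailing `\w*`, not something raised in the original answer.

```python
import re

pattern = re.compile(r'(\w*(\w)(?=\2))(\w*er$)')
match = pattern.search('Letter')

# The lookahead does not consume the second 't', so group 3 starts
# at the position where group 1 ends.
print(match.span(1))  # (0, 3)
print(match.span(3))  # (3, 6)

# Caveat: the trailing \w* allows extra characters between the doubled
# letter and 'er', so words like this one also match.
print(pattern.search('bookkeeper').groups())
# ('bookke', 'e', 'eper')
```

If only words of the form "doubled letter plus er" should match, replacing the final `\w*er$` with `\2er$` would probably be the stricter choice.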
