
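How many words contain some two-letter sequence repeated 3 times? For example, "contentment" and "maintaining" are such words, because "contentment" has the sequence "nt" repeated three times and "maintaining" has the sequence "in" repeated three times.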

Here is my code:

 len([f for f in file if re.match(r'(.*?[a-z]{2}.*?){3}',f)])
4

2 Answers


Here is a simple regular expression:

.*(\w{2}).*\1.*\1

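It captures two word characters in a group, (\w{2}), and then requires the same captured text to appear two more times via the backreference \1. Note that \w also matches digits and underscores, not only letters.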
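
To see why the backreference matters, here is a minimal sketch comparing the pattern from the question with the one above. The variable names and the extra sample words ("abcdef", "banana") are only illustrative. The question's pattern accepts any three runs of two lowercase letters, so it never checks that the runs are the same.

```python
import re

# Pattern from the question: three runs of ANY two lowercase letters.
loose = re.compile(r'(.*?[a-z]{2}.*?){3}')
# Pattern from this answer: the SAME captured pair must occur three times.
strict = re.compile(r'.*(\w{2}).*\1.*\1')

for word in ["contentment", "maintaining", "abcdef", "banana"]:
    # "abcdef" and "banana" pass the loose check but fail the strict one.
    print(word, bool(loose.match(word)), bool(strict.match(word)))
```

The loose pattern matches all four words, while the strict pattern matches only "contentment" and "maintaining".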

Here is a working example:

import re

text = """
How many words contain some two-letter sequence repeated 3 times? For example, "contentment" and "maintaining" are such words because "contentment" has the sequence "nt" repeated three times and "maintaining" has the sequence "in" repeated three times.
"""


def check(word):
    return re.match(r".*(\w{2}).*\1.*\1", word)


def main():
    for word in text.split():
        if check(word):
            print(word)


main()
answered 2020-03-13T10:36:03.070

You can use

\b(?=\w*(\w{2})(?:\w*\1){2})\w+

See the regex demo.

Details

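  • \b - a word boundary
  • (?=\w*(\w{2})(?:\w*\1){2}) - a positive lookahead that requires, immediately to the right of the current position, zero or more word characters, then two word characters captured into Group 1, then two repetitions of zero or more word characters followed by the same text as in Group 1
  • \w+ - consumes one or more word characters.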
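
As a small illustration of the points above (the sample text and variable names are made up for this sketch): because the lookahead only steps over \w*, all three occurrences must lie inside one word. In the text below, "in" appears three times, but spread across three separate words, so only "contentment" is reported. Group 1 holds the repeated pair.

```python
import re

word_pattern = re.compile(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+')

# "in" occurs in "tin", "pin" and "bin", but never three times in ONE word.
text = "tin pin bin contentment"

for m in word_pattern.finditer(text):
    # m.group() is the whole word, m.group(1) the repeated two-character sequence.
    print(m.group(), m.group(1))
# => contentment nt
```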

See the Python demo:

import re

text = "contentment and maintaining are such words"
pattern = re.compile(r'\b(?=\w*(\w{2})(?:\w*\1){2})\w+')
matches = [m.group() for m in pattern.finditer(text)]
print(matches)
# => ['contentment', 'maintaining']
print(len(matches))
# => 2
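
To get the count the question asks for from a file, something like the following sketch might work. It assumes the file holds one word per line, which the question does not actually state. The helper name count_matching_words and the io.StringIO stand-in for the open file are hypothetical. With one word per line, the lookahead is not needed, so a plain search with the core of the pattern is used instead.

```python
import io
import re

# Core of the pattern above: a pair of word characters that recurs twice more.
PATTERN = re.compile(r'(\w{2})(?:\w*\1){2}')


def count_matching_words(lines):
    """Count lines whose word contains some two-character sequence three times."""
    return sum(1 for line in lines if PATTERN.search(line.strip()))


# Stand-in for an open word-list file (assumed layout: one word per line).
word_file = io.StringIO("contentment\nmaintaining\nbanana\nregex\n")
print(count_matching_words(word_file))
# => 2
```

For a real file, the same function could be called on the object returned by open(), since iterating a file yields its lines.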
answered 2020-03-13T10:32:27.263