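If you need a pure regex solution, you can only do this with .NET or the Python PyPI regex module, because it requires two things that regex libraries usually lack: 1) right-to-left input string parsing and 2) infinite-width lookbehind.
Here is a Python solution: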
import regex  # third-party module from PyPI (pip install regex), not the built-in re
text="Teaching psychology is the part of educational psychology that refers to school education. As will be seen later, both have the same objective: to study, explain and understand the processes of behavioral change that are produce in people as a consequence of their participation in activities educational What gives an entity proper to teaching psychology is the nature and the characteristics of the educational activities that exist at the base of the of behavioral change studied."
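# (?r) scans right to left; the lookbehind rejects any word that already occurs further left
rx = r'(?rus)(?<!\b\1\b.*?)\b(\w+)\b'
# Matches are returned last-to-first, so reverse them to restore the original order
print(list(reversed(regex.findall(rx, text))))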
See the online demo.
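Details

(?rus) - inline flags: r enables right-to-left input string parsing (all patterns inside the regex still match left to right as usual, so the matched text is not reversed); u makes \w Unicode-aware in Python 2 (it is the default in Python 3); s is the DOTALL modifier that makes . match line breaks
(?<!\b\1\b.*?) - a negative lookbehind that fails the match if, anywhere to the left of the current location, the same text as captured in Group 1 (defined later in the expression) occurs as a whole word, followed by any 0+ chars up to the current location
\b(\w+)\b - a whole word: 1+ word chars within word boundaries.

reversed is used to print the words in their original order, because the right-to-left regex matches them from the end of the string to the start.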
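
To sanity-check what the pattern is expected to return, here is a minimal standard-library sketch that does not use the regex module. It is not the pure-regex solution from this answer. The function names and the sample sentence are made up for illustration. The first function keeps the first occurrence of each word; the second imitates the right-to-left scan by walking the words from the end and keeping a word only when it does not occur anywhere to its left. Both assume that words are runs of \w characters and that comparison is case-sensitive, as in the pattern above.

```python
import re


def unique_words(text):
    """Return each word once, in order of first appearance."""
    words = re.findall(r'\w+', text)
    # dict keys keep insertion order (Python 3.7+), so later duplicates are dropped
    return list(dict.fromkeys(words))


def unique_words_right_to_left(text):
    """Imitate the (?r) scan order: walk the words from the end, keep a word
    only if it does not occur anywhere to its left, then restore the order."""
    words = re.findall(r'\w+', text)
    kept = []
    for i in range(len(words) - 1, -1, -1):
        # Plays the role of the negative lookbehind: no earlier occurrence allowed
        if words[i] not in words[:i]:
            kept.append(words[i])
    # Words were collected last-to-first, so reverse them
    return list(reversed(kept))


sample = "the cat saw the dog and the cat ran"  # hypothetical sample input
print(unique_words(sample))
print(unique_words_right_to_left(sample))
```

Both calls print ['the', 'cat', 'saw', 'dog', 'and', 'ran'] for this sample. The second function is quadratic in the number of words and is only meant to mirror the logic of the lookbehind, not to be efficient.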