python-3.x - Python: difference between searching with a regular expression and replacing with a regular expression

Question

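I am writing a small script that uses pywikibot to automatically correct interwiki translation links. I look for the existing links and want to rewrite them in a standard format, with links to all pages.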

The text I am looking for looks something like

{{Trad|EN=Under Spring|FR=Sources Interdites|DE=Verbotene Quellen}}

or, spread over several lines, something like

{{Trad
|DE=Urwurzeln
|EN=Prime Roots
|ES=Raíces Primarias
|FR=Primes Racines
|RU =Изначальных Корней
|H  = 
|palette=primes
}}

I managed to find both instances in the wiki page source with

reg_strg = '{{trad([\w\s\|\=]*)}}'
rex = re.search(reg_strg, text, re.IGNORECASE | re.MULTILINE)

which gives me the core of the template (for the first case)

|EN=Under Spring|FR=Sources Interdites|DE=Verbotene Quellen

and a similar multi-line string for the second.

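However, when I use the same reg_strg in a replace call, no replacement takes place and the text stays unchanged. new_strg is built from what was read and is meant to be the replacement string, but the result is the same whether new_strg is a multi-line string or just a simple "flobberigoo":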

text = re.sub(reg_strg, new_strg, text, re.IGNORECASE | re.MULTILINE)

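So clearly there is some difference between re.search and re.sub, but I cannot find it in the documentation. (I do know the difference between re.search and re.match, and as I understand it re.sub should behave like the former.)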

What am I missing? How can I replace the regular expression match I found in the page with a string?

For completeness, here is the full function, including the debug prints:

def replace_translation_template(self, text, translations):
    """
    @param text The page text to look through
    @param translations dictionary of translations
    """
    reg_strg = '{{trad([\w\s\|\=]*)}}'
    rex = re.search(reg_strg, text, re.IGNORECASE | re.MULTILINE)

    print("Replacing:")
    try:
        print(rex.group(1))
        strgs = rex.group(1).split('|')
        print(strgs)
        new_strg = ""
        for lang,pagename in translations.items():
            if pagename is None:
                pagename = ""
            new_strg += '|' + lang.upper() + '=' + pagename + '\n'

        #print("New_strg: ", new_strg)

        for lang in translations.keys():
            for (n,str) in enumerate(strgs):
                if lang.upper() in str:
                    strgs.pop(n)

        for s in strgs:
            if len(s) > 2:
                new_strg += '|' + s + '\n'
        print(new_strg, '\n')
        print('\n with \n \n')
        text = re.sub(reg_strg, new_strg, text, re.IGNORECASE | re.MULTILINE)
        print(text)

    except:
        print("no text matched:", rex)
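
One likely cause, offered as a sketch and not verified against the actual wiki pages: the two functions take their arguments in a different order. The signatures are re.search(pattern, string, flags=0) and re.sub(pattern, repl, string, count=0, flags=0), so in the re.sub call above the flags land in the count slot and the pattern is matched case-sensitively ('{{trad' never matches '{{Trad'). The sample text and the names PATTERN, FLAGS, unchanged and replaced below are made up for illustration.

```python
import re

# Same pattern as in the question, written as a raw string.
PATTERN = r'{{trad([\w\s\|\=]*)}}'
FLAGS = re.IGNORECASE | re.MULTILINE

# Hypothetical sample text, modelled on the first template in the question.
text = "Intro {{Trad|EN=Under Spring|FR=Sources Interdites}} outro"

# re.search(pattern, string, flags=0): the third positional argument is flags,
# so IGNORECASE is applied and the template is found.
assert re.search(PATTERN, text, FLAGS) is not None

# re.sub(pattern, repl, string, count=0, flags=0): the fourth positional
# argument is count. re.sub(PATTERN, "flobberigoo", text, FLAGS) therefore
# means the same as the call below: no flags, at most 10 replacements.
unchanged = re.sub(PATTERN, "flobberigoo", text, count=int(FLAGS))

# Passing the flags by keyword applies IGNORECASE as intended.
replaced = re.sub(PATTERN, "flobberigoo", text, flags=FLAGS)

print(unchanged)
print(replaced)
```

If this is indeed the cause, changing the call in the function to re.sub(reg_strg, new_strg, text, flags=re.IGNORECASE | re.MULTILINE) should make the replacement happen. Note that the match includes the surrounding {{Trad ... }}, so the replacement string would need to contain those delimiters as well if the template is to survive.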
