python - Replacing strings in Python

Question

I have the following sequences in C code:

variable == T_CONSTANT

or

variable != T_CONSTANT

Using Python, how can I replace these with SOME_MACRO(variable) and !SOME_MACRO(variable), respectively?

score 1 · Accepted Answer

A very simple, and error-prone, approach is to use a regular expression:

>>> s = "a == T_CONSTANT"
>>> import re
>>> re.sub(r"(\w+)\s*==\s*T_CONSTANT", r"SOME_MACRO(\1)", s)
'SOME_MACRO(a)'

A similar regular expression can be used for the != case.
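As a rough sketch of how both cases could be handled in one pass, the snippet below passes a callback to re.sub() instead of a replacement string. The names PATTERN, to_macro and replace_comparisons are made up for illustration and are not part of the answer above. The \b guards are an added assumption about what you want: they stop T_CONSTANT_2 from being rewritten by accident. It is still only a regex, so it knows nothing about C comments, string literals or member access such as p->x.

```python
import re

# Matches "<identifier> == T_CONSTANT" or "<identifier> != T_CONSTANT".
# The trailing \b keeps longer names such as T_CONSTANT_2 from matching.
PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*(==|!=)\s*T_CONSTANT\b")

def to_macro(m):
    # group(1) is the variable name, group(2) is the operator.
    prefix = "!" if m.group(2) == "!=" else ""
    return "%sSOME_MACRO(%s)" % (prefix, m.group(1))

def replace_comparisons(source):
    # re.sub() calls to_macro() once per match and splices in its result.
    return PATTERN.sub(to_macro, source)

print(replace_comparisons("if (a == T_CONSTANT && b!=T_CONSTANT) f(c == T_CONSTANT_2);"))
```

The last comparison is left alone because T_CONSTANT_2 is a different identifier.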

score 0 · Accepted Answer

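OK, I know you've already accepted another answer. Still, I couldn't resist throwing this one out there, because your problem comes up often and regular expressions are not always a good fit.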

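This code defines a tiny, limited, non-recursive parsing expression grammar. It lets you describe what you are searching for as a sequence of compiled regular expressions, alternatives (tuples of the different strings that may match) and plain strings. For computer languages this format is more convenient than a single regular expression, because it resembles a formal specification of the language's grammar. Basically, [varname, ("==", "!="), "T_CONSTANT"] describes what you are looking for, and the action() function describes what to do when you find it.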

I have included a small corpus of example "C" code to demonstrate the parser.

import re

# The @match()@ function takes a parsing specification and a list of
# words and tries to match them together. It will accept compiled
# regular expressions, tuples of strings or plain strings.

__matcher = re.compile("x") # Dummy for testing spec elements.
def match(spec, words):
    if len(spec) > len(words): return False

    for i in range(len(spec)):
        if   type(__matcher) is type(spec[i]):
            if not spec[i].match(words[i]): return False
        elif type(()) is type(spec[i]):
            if words[i] not in spec[i]: return False
        else:
            if words[i] != spec[i]: return False

    return True

# @parse()@ takes a parsing specification, an action to execute if the
# spec matches and the text to parse. There can be multiple matches in
# the text. It splits and rejoins on spaces. A better tokenisation
# method is left to the reader...

def parse(spec, action, text):
    words = text.strip().split() 

    n = len(spec)
    out = []
    while words:
        if match(spec, words[:n]):
            # Hand exactly the n matched words to the action, then skip them.
            out.append(action(words[:n]))
            words = words[n:]
        else:
            out.append(words[0])
            words = words[1:]

    return " ".join(out)

# This code is only executed if this file is run standalone (so you
# can use the above as a library module...)

if "__main__" == __name__:
    # This is a chunk of bogus C code to demonstrate the parser with:

    corpus = """\
/* This is a dummy. */
variable == T_CONSTANT
variable != T_CONSTANT
/* Prefix! */ variable != T_CONSTANT
variable == T_CONSTANT /* This is a test. */
variable != T_CONSTANT ; variable == T_CONSTANT /* Note contrived placement of semi. */
x = 9 + g;
"""

    # This compiled regular expression defines a legal C/C++ variable
    # name. Note "^" and "$" guards to make sure the full token is matched.
    varname = re.compile("^[A-Za-z_][A-Za-z0-9_]*$")

    # This is the parsing spec, which describes the kind of expression
    # we're searching for.
    spec = [varname, ("==", "!="), "T_CONSTANT"]

    # The @action()@ function describes what to do when we have a match.
    def action(words):
        if "!=" == words[1]: return "!SOME_MACRO(%s)" % words[0]
        else:                return "SOME_MACRO(%s)"  % words[0]

    # Process the corpus line by line, applying the parser to each line.
    for line in corpus.split("\n"): print(parse(spec, action, line))

If you run it, the result is:

/* This is a dummy. */
SOME_MACRO(variable)
!SOME_MACRO(variable)
/* Prefix! */ !SOME_MACRO(variable)
SOME_MACRO(variable) /* This is a test. */
!SOME_MACRO(variable) ; SOME_MACRO(variable) /* Note contrived placement of semi. */
x = 9 + g;
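Because parse() splits on whitespace, an expression written without spaces, such as variable==T_CONSTANT, is never recognised. Below is a minimal sketch of a regex-based tokeniser, offered as one possible take on the "better tokenisation method" the code comment leaves to the reader. TOKEN and tokenize are hypothetical names, and the token classes (comments, identifiers, integers, the two comparison operators, then any other single character) are an assumption chosen only to cover the sample corpus, not a full C lexer.

```python
import re

# Alternatives are tried left to right, so "==" and "!=" are taken as one
# token before the catch-all \S can split them into single characters.
TOKEN = re.compile(r"/\*.*?\*/|[A-Za-z_]\w*|\d+|==|!=|\S", re.S)

def tokenize(text):
    # findall() returns every non-overlapping match in order; whitespace
    # between tokens is simply skipped.
    return TOKEN.findall(text)

print(tokenize("variable==T_CONSTANT;x!=T_CONSTANT"))
print(tokenize("/* Prefix! */ variable != T_CONSTANT"))
```

Swapping text.strip().split() in parse() for a call like tokenize(text) should let the same spec match unspaced input. The output is still rejoined with single spaces, so the original spacing is not preserved.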

Oh well, I had fun! ;-)
