python - How to distinguish between "string" and "actual code" in Python?

Question

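My work involves instrumenting code fragments in Python code. I am writing a Python script that takes another Python file as input and inserts whatever code is needed at the required places.

The following is a sample of the file I want to instrument: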

A.py #normal un-instrumented code

statements
....
....

def move(self,a):
    statements
    ......
    print "My function is defined" 
    ......

statements 
......

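My script checks every line of A.py, and if the line contains "def", it inserts a code fragment above the function's def.

The following example shows what the final output should look like: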

A.py #instrumented code

statements
....
....

@decorator    #<------ inserted code
def move(self,a):
    statements
    ......
    print "My function is defined" 
    ......

statements 
......

But I am getting a different output. This is the final output I actually get:

A.py #instrumented code

statements
....
....

@decorator    #<------ inserted code
def move(self,a):
    statements
    ......
    @decorator #<------ inserted code [this should not occur]
    print "My function is defined" 
    ......

statements 
......

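I understand why: the instrumenting script recognizes the "def" inside the word "defined", so it inserts the fragment above that line as well.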
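A minimal reproduction of that behaviour, assuming the detection step is a plain substring test (the actual script is not shown here, so `looks_like_def` is only a hypothetical stand-in):

```python
# Hypothetical stand-in for the detection step; the real script is not shown.
def looks_like_def(line):
    # A plain substring test cannot tell a keyword from text inside a string.
    return "def" in line

lines = [
    "def move(self, a):",                    # a real function definition
    '    print "My function is defined"',    # "def" only appears inside "defined"
]
flags = [looks_like_def(line) for line in lines]
print(flags)
```

Both lines are flagged, which matches the unwanted second `@decorator` in the output above.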

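In practice the instrumented code has many problems of this kind, and I cannot instrument the given Python file correctly. Is there any other way to tell an actual "def" apart from one inside a string?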

Thanks

score 3 · Accepted Answer

Use the ast module to parse the file properly.

This code prints the line number and column offset of each def statement:

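import ast

# Parse the whole file into a syntax tree instead of scanning raw text.
with open('mymodule.py') as f:
    tree = ast.parse(f.read())

# Only real function definitions become FunctionDef nodes;
# a "def" inside a string or comment never does.
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print("%d %d" % (node.lineno, node.col_offset))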
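To get from locating the definitions to the output the question asks for, the line numbers can drive the insertion. The following is a minimal sketch rather than a definitive implementation. `add_decorator` and `sample` are hypothetical names. It assumes Python 3, and it assumes the target file parses under the interpreter running the script (the Python 2 style `print "..."` in the question would not parse under Python 3).

```python
import ast

def add_decorator(source, decorator="@decorator"):
    """Return `source` with `decorator` inserted above every function definition."""
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)

    # Collect (line number, column offset) of every real function definition.
    targets = [
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]

    # Insert from the bottom up so earlier line numbers stay valid.
    for lineno, col in sorted(targets, reverse=True):
        indent = lines[lineno - 1][:col]  # reuse the def line's own indentation
        lines.insert(lineno - 1, indent + decorator + "\n")
    return "".join(lines)

# Made-up sample input, loosely modelled on the question's A.py.
sample = (
    "class Robot:\n"
    "    def move(self, a):\n"
    "        print('My function is defined')\n"
)
instrumented = add_decorator(sample)
print(instrumented)
```

Because the positions come from the parsed tree, the word "defined" inside the string literal is never touched.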

score 0 · Accepted Answer

You could use a regular expression. To avoid matching def inside quotes, you can use negative look-arounds:

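import re

with open('A.py') as f:
    for line in f:
        # Match a standalone "def" that is not directly next to a quote character.
        # (?<!...) is a negative lookbehind, (?!...) a negative lookahead.
        m = re.search(r"(?<![\"'])\bdef\b(?![\"'])", line)
        if m:
            print(r'@decorator    #<------ inserted code')

        # line already ends with a newline, so strip it before printing
        print(line.rstrip('\n'))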

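However, there are probably other occurrences of def that neither you nor I can think of, and if we are not careful we will end up rewriting the Python parser. In the long run, @Janne Karila's suggestion of using ast.parse is probably safer.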
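To illustrate the kind of case the look-arounds miss, here is a small check. It assumes the intended pattern is a negative lookbehind plus a negative lookahead for a quote character. The sample lines are made up for illustration.

```python
import re

# Assumed pattern: a standalone "def" with no quote directly before or after it.
pattern = re.compile(r"(?<![\"'])\bdef\b(?![\"'])")

lines = [
    "def move(self, a):",                  # real definition
    'print("My function is defined")',     # "defined" is rejected by \b
    'msg = "use def to define things"',    # def in the middle of a string
    "# def mentioned in a comment",        # def inside a comment
]
matches = [bool(pattern.search(line)) for line in lines]
print(matches)
```

The look-arounds only inspect the single character on each side of the word, so a def in the middle of a string or in a comment is still reported as a match.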
