3

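I'm trying to use a regex in Python to match a string that contains a function definition and nothing else. I keep running into problems. Is a regex the right tool here?

For example: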

def foo():
  print bar
-- Matches --

a = 2
def foo():
  print bar
-- Doesn't match as there's code above the def --

def foo():
  print bar
a = 2
-- Doesn't match as there's code below the def --

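An example of a string I'm trying to parse is "def isPalindrome(x):\n return x == x[::-1]", but in practice the string may contain lines above or below the def itself.

What regular expression do I need to achieve this?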

4

2 Answers

8

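No, regular expressions are not the right tool for this job. This is similar to people desperately trying to parse HTML with regexes. These languages are not regular, so a regex cannot handle all the quirks you will run into.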
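
To make the "not regular" point concrete, here is a small sketch of how a line-based pattern gets fooled. The pattern `DEF_PATTERN` and the sample `source` are made up for illustration (they are not taken from the question), and the pattern is deliberately simplified. It also reports a `def` line that sits inside a triple-quoted string, which is not a function definition at all:

```python
import re

# Hypothetical, deliberately simple pattern: a line that starts with "def name(...):"
DEF_PATTERN = re.compile(r'^[ \t]*def \w+\(.*\):', re.MULTILINE)

# Sample input: one "def" inside a string literal, one real definition.
source = '''
text = """
def not_a_function():
    pass
"""

def real_function():
    pass
'''

matches = [m.group().strip() for m in DEF_PATTERN.finditer(source)]
print(matches)
# ['def not_a_function():', 'def real_function():']
```

A regex can be patched for this particular case, but each patch tends to expose the next one (signatures split over several lines, decorators, line continuations, and so on).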

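Use the built-in parser module to build a parse tree, look for the definition nodes, and work with those. Better still, use the ast module, which is more convenient (the parser module was also removed in Python 3.10). An example: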

import ast

mdef = 'def foo(x): return 2*x'
a = ast.parse(mdef)
# Collect every function definition node, including nested ones.
definitions = [n for n in ast.walk(a) if isinstance(n, ast.FunctionDef)]
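
Applied to the original question, which wants to accept a string only when it is a function definition and nothing else, the same idea might look like the sketch below. The helper name `is_single_function_def` is hypothetical. It assumes Python 3, and `ast.parse` uses the grammar of the interpreter running it, so Python 2 sources such as `print bar` would be rejected as syntax errors under Python 3.

```python
import ast


def is_single_function_def(source):
    """Return True if `source` is exactly one top-level `def` and nothing else.

    Hypothetical helper for illustration. `async def` is not counted here;
    add ast.AsyncFunctionDef to the isinstance check if it should be.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Not valid Python for the running interpreter, so not a valid def.
        return False
    body = tree.body  # top-level statements of the module
    return len(body) == 1 and isinstance(body[0], ast.FunctionDef)


print(is_single_function_def("def isPalindrome(x):\n return x == x[::-1]"))  # True
print(is_single_function_def("a = 2\ndef foo():\n  return a"))               # False
print(is_single_function_def("def foo():\n  return 1\na = 2"))               # False
```

Comments and blank lines around the definition do not produce statements in the tree, so they do not affect the result; any other top-level statement above or below the `def` does.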
answered 2013-03-01T12:40:45.577
2
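import re

# Group 2 is the indentation of the def line; group 3 is the extra indentation of its body.
reg = re.compile(r'((^ *)def \w+\(.*?\): *\r?\n'
                 r'(?: *\r?\n)*'
                 r'\2( +)[^ ].*\r?\n'
                 r'(?: *\r?\n)*'
                 r'(\2\3.*\r?\n(?: *\r?\n)*)*)',
                 re.MULTILINE)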

Edit

import re
script = '''
def foo():
  print bar

a = 2
def foot():
  print bar

b = 10
"""
opopo =457
def foor(x):


  print bar
  print x + 10
  def g(u):
    print

  def h(rt,o):
    assert(rt==12)
a = 2
class AZERT(object):
   pass
"""


b = 10
def tabulae(x):


\tprint bar
\tprint x + 10
\tdef g(u):
\t\tprint

\tdef h(rt,o):
\t\tassert(rt==12)
a = 2


class Z:
    def inzide(x):


      print baracuda
      print x + 10
      def gululu(u):
        print

      def hortense(rt,o):
        assert(rt==12)



def oneline(x): return 2*x


def scroutchibi(h%,n():245sqfg srot b#

'''


reg = re.compile('((?:^[ \t]*)def \w+\(.*\): *(?=.*?[^ \t\n]).*\r?\n)'
                 '|'
                 '((^[ \t]*)def \w+\(.*\): *\r?\n'
                 '(?:[ \t]*\r?\n)*'
                 '\\3([ \t]+)[^ \t].*\r?\n'
                 '(?:[ \t]*\r?\n)*'
                 '(\\3\\4.*\r?\n(?: *\r?\n)*)*)',
                 re.MULTILINE)

regcom = re.compile('("""|\'\'\')(.+?)\\1',re.DOTALL)


avoided_spans = [ma.span(2) for ma in regcom.finditer(script)]

print 'eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee'
for ma in  reg.finditer(script):
    print ma.group(),
    print '--------------------'
    print repr(ma.group())
    print
    try:
        exec(ma.group().strip())
    except:
        print "   isn't a valid definition of a function"
    am,bm = ma.span()
    if any(a<=am<=bm<=b for a,b in avoided_spans):
        print '   is a commented definition function' 

    print 'eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee'

Result

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foo():
  print bar

--------------------
'def foo():\n  print bar\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foot():
  print bar

--------------------
'def foot():\n  print bar\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def foor(x):


  print bar
  print x + 10
  def g(u):
    print

  def h(rt,o):
    assert(rt==12)
--------------------
'def foor(x):\n\n\n  print bar\n  print x + 10\n  def g(u):\n    print\n\n  def h(rt,o):\n    assert(rt==12)\n'

   is a commented definition function
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def tabulae(x):


    print bar
    print x + 10
    def g(u):
        print

    def h(rt,o):
        assert(rt==12)
--------------------
'def tabulae(x):\n\n\n\tprint bar\n\tprint x + 10\n\tdef g(u):\n\t\tprint\n\n\tdef h(rt,o):\n\t\tassert(rt==12)\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
    def inzide(x):


      print baracuda
      print x + 10
      def gululu(u):
        print

      def hortense(rt,o):
        assert(rt==12)



--------------------
'    def inzide(x):\n\n\n      print baracuda\n      print x + 10\n      def gululu(u):\n        print\n\n      def hortense(rt,o):\n        assert(rt==12)\n\n\n\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def oneline(x): return 2*x
--------------------
'def oneline(x): return 2*x\n'

eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
def scroutchibi(h%,n():245sqfg srot b#
--------------------
'def scroutchibi(h%,n():245sqfg srot b#\n'

   isn't a valid definition of a function
eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
answered 2013-03-01T13:28:33.780