In Python:

  1. Assignment is not allowed in conditionals (true of Python before 3.8).
  2. Whether a regex matched is determined from the returned match object, which also carries the rest of the match information.

Now say we want to match one particular pattern out of 10 or 15. We end up with something cluttered like this:

m = pat1.match(buffer)
if m:    
    tok = tok1
    val = m.group(0)
else:
    m = pat2.match(buffer)
    if m:
        tok = tok2 
        val = m.group(0)
    else:
        m = pat3.match(buffer)
        if m:
            tok = tok3
            val = m.group(0)
            # extra processing here and there - makes looping unsuitable
        else:
            m = pat4.match(buffer)
            if m:
                tok = tok4
                val = m.group(0)
            else:    
                # ... keep indenting 

We would really like to have something as follows:

if match ... pat1:
    tok = 
    val = 
elif match ... pat2:
    tok = 
    val = 
elif match ... pat3:
    tok = 
    val = 
...

(As can be done in other languages, possibly using features such as assignment in conditionals, side effects on a standard match object, or a different form of the match function with pass-by-reference arguments ...)

We could perhaps loop over the patterns, but that wouldn't be suitable when the processing varies from match to match.

So: is there a nice Pythonic way to keep the match conditionals at the same level?

1 Answer

Loop over the patterns and tokens in pairs, so you can adapt the following:

for pat, token in zip([pat1, pat2, pat3], ['tok1', 'tok2', 'tok3']):
    m = pat.match(buffer)
    if m:
        val = m.group(0)
        tok = token  # the token paired with the pattern that matched
        break

The idea is to build a table of pattern -> value up front:

import re

tests = [
    (re.compile(r'([a-z]{2})'), 'token1'),
    (re.compile(r'(a{5})'), 'token2'),
]

for pattern, token in tests:
    m = pattern.match(buffer)
    if m:
        # whatever
        break  # stop at the first pattern that matches

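This can be extended further by supplying a callable that takes the match object and the buffer as arguments, does whatever it needs with them, and returns a value.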

For example:

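import re

def func1(match, buf):
    print('entered function')
    # Convert only the matched digits; int(buf) raises if anything follows them
    return int(match.group(0)) * 50

tests = [
    (re.compile(r'\d+'), func1),
]

for pattern, func in tests:
    m = pattern.match(buffer)
    if m:
        result = func(m, buffer)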
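
If you are on Python 3.8 or later, assignment expressions (the `:=` operator from PEP 572) should also allow the flat if/elif chain the question asks for, with any per-branch processing kept inline. A minimal sketch, assuming made-up patterns and token names (`NUMBER`, `NAME`, `SPACE` and `next_token` are hypothetical, not from the question):

```python
import re

# Hypothetical patterns and token names, for illustration only.
NUMBER = re.compile(r"\d+")
NAME = re.compile(r"[A-Za-z_]\w*")
SPACE = re.compile(r"\s+")


def next_token(buffer):
    """Return (tok, val) for the first pattern matching at the start of buffer."""
    if m := NUMBER.match(buffer):
        tok = "NUMBER"
        val = int(m.group(0))  # branch-specific processing stays inline
    elif m := NAME.match(buffer):
        tok = "NAME"
        val = m.group(0)
    elif m := SPACE.match(buffer):
        tok = "SPACE"
        val = m.group(0)
    else:
        raise ValueError("no pattern matches: %r" % buffer[:20])
    return tok, val


print(next_token("42 apples"))  # ('NUMBER', 42)
print(next_token("apples 42"))  # ('NAME', 'apples')
```

The table-driven loop is probably still the better fit when every branch does the same thing; the chain above is mainly useful when the branches differ.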
answered 2013-08-19T23:06:56.640