I have the following text:

ERROR: <C:\Includes\Library1.inc:123> This is the Error
Call Trace:
    <C:\Includes\Library2.inc:456>
    <C:\Includes\Library2.inc:789>
    <C:\Code\Main.ext:12> 
    <Line:1> 
ERROR: <C:\Includes\Library2.inc:2282> Another Error
Call Trace:
    <C:\Code\Main.ext:34>
    <C:\Code\Main.ext:56>
    <C:\Code\Main.ext:78>
    <Line:1> 
ERROR: <C:\Code\Main.ext:90> Error Three

I want to extract the following information:

line, Error = 12, This is the Error
line, Error = 34, Another Error
line, Error = 90, Error Three

This is how far I have got:

import re

theText = 'ERROR: ...'
ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
mainName = r'\Main.ext'  # raw string, so the backslash is kept literally
# Go through each line
for fullline in theText.splitlines():
    # ERROR_RE is a module-level name here, so no 'self.' prefix
    match = ERROR_RE.match(fullline)
    if match:
        path, line, error = match.group('path'), match.group('line'), match.group('error')
        if path.endswith(mainName):
            callSomething(line, error)
        # else check next line for 'Call Trace:'
        # check next lines for mainName and get the linenumber
        # callSomething(linenumber, error)

What is the Pythonic way to loop over the remaining elements from inside a loop?

Solution: http://codepad.org/BcYmybin


2 Answers


The direct answer to your question about how to loop over the remaining lines is: change the first line of your loop to

lines = theText.splitlines()
for (linenum, fullline) in enumerate(lines):

Then, after a match, you can get the remaining lines by looking at lines[j] in an inner loop where j starts at linenum+1 and runs until the next match.
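A minimal sketch of that inner loop, assuming the input looks like the sample in the question. `FRAME_RE`, `MAIN_NAME`, and `find_main_errors` are hypothetical names introduced for illustration; they are not part of the original code.

```python
import re

ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Hypothetical pattern for a call-trace frame such as "    <C:\Code\Main.ext:12> "
FRAME_RE = re.compile(r'^\s*<(?P<path>.*):(?P<line>[0-9]+)>\s*$')
MAIN_NAME = r'\Main.ext'


def find_main_errors(text):
    """Return (line, error) pairs, using the first Main.ext location per error."""
    results = []
    lines = text.splitlines()
    for linenum, fullline in enumerate(lines):
        match = ERROR_RE.match(fullline)
        if not match:
            continue
        path, line, error = match.group('path', 'line', 'error')
        if path.endswith(MAIN_NAME):
            # The error itself was raised in the main file.
            results.append((int(line), error))
            continue
        # Inner loop: j starts at linenum + 1 and stops at the next ERROR line.
        for j in range(linenum + 1, len(lines)):
            if ERROR_RE.match(lines[j]):
                break
            frame = FRAME_RE.match(lines[j])
            if frame and frame.group('path').endswith(MAIN_NAME):
                results.append((int(frame.group('line')), error))
                break
    return results
```

The outer loop still visits every line, so the trace lines are scanned twice; for logs of this size that is unlikely to matter.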

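However, a slicker way to approach this problem is to first split the text into blocks. There are many ways to do that, but, being a former Perl user, my impulse is to use a regular expression.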

# Split into blocks that start with /^ERROR/ and run until either the next
# /^ERROR/ or until the end of the string.
#
# (?m)      - lets '^' and '$' match the beginning/end of each line
# (?s)      - lets '.' match newlines
# ^ERROR    - triggers the beginning of the match
# .*?       - grab characters in a non-greedy way, stopping when the following
#             expression matches
# (?=^ERROR|$(?!\n)) - match until the next /^ERROR/ or the end of string
# $(?!\n)   - match end of string.  Normally '$' suffices but since we turned
#             on multiline mode with '(?m)' we have to use '$(?!\n)' to prevent
#             this from matching end-of-line.
blocks = re.findall(r'(?ms)^ERROR.*?(?=^ERROR|$(?!\n))', theText)
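Once the text is split into blocks, each block can be handled on its own. The following is only a sketch under the assumption that every block starts with an `ERROR:` line followed by optional call-trace frames; `FRAME_RE`, `MAIN_NAME`, and `errors_from_blocks` are hypothetical names, not part of the original answer.

```python
import re

BLOCK_RE = re.compile(r'(?ms)^ERROR.*?(?=^ERROR|$(?!\n))')
ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Hypothetical pattern for a call-trace frame line.
FRAME_RE = re.compile(r'^\s*<(?P<path>.*):(?P<line>[0-9]+)>\s*$')
MAIN_NAME = r'\Main.ext'


def errors_from_blocks(text):
    """Return one (line, error) pair per block that mentions the main file."""
    results = []
    for block in BLOCK_RE.findall(text):
        block_lines = block.splitlines()
        head = ERROR_RE.match(block_lines[0])
        if head is None:
            continue  # first line is not in the expected format
        error = head.group('error')
        if head.group('path').endswith(MAIN_NAME):
            results.append((int(head.group('line')), error))
            continue
        # Otherwise take the first call-trace frame located in the main file.
        for frame_line in block_lines[1:]:
            frame = FRAME_RE.match(frame_line)
            if frame and frame.group('path').endswith(MAIN_NAME):
                results.append((int(frame.group('line')), error))
                break
    return results
```

Because each block is self-contained, no state has to be carried from one error to the next.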
answered 2013-06-27T13:26:14.937

Replace this:

        # else check next line for 'Call Trace:'
        # check next lines for mainName and get the linenumber
        # callSomething(linenumber, error)

with this:

    match = stackframe_re.match(fullline)
    if match and error: # if error is defined from earlier when you matched ERROR_RE
        path, line = match.group('path'), match.group('line')
        if path.endswith(mainName):
            callSomething(line, error)
            error = None # don't report this error again if you see main again

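Mind the indentation. Also initialize error = None before the loop starts, and set error = None after the first call to callSomething. In general, the code I suggest should work for well-formed data, but you may want to improve it so that it does not give misleading results when the data does not match the format you expect.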

You will have to write stackframe_re yourself, but it should be an RE that matches, for example,

    <C:\Includes\Library2.inc:789>
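One possible `stackframe_re`, together with the single-pass loop described above, might look like the sketch below. The pattern is an assumption based on the sample frames in the question, and `report_errors` is a hypothetical wrapper that takes `callSomething` as a parameter so the snippet is self-contained.

```python
import re

ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Assumed frame format: optional indentation, then "<path:line>".
stackframe_re = re.compile(r'^\s*<(?P<path>.*):(?P<line>[0-9]+)>\s*$')
mainName = r'\Main.ext'


def report_errors(theText, callSomething):
    """Call callSomething(line, error) once per error that reaches the main file."""
    error = None  # no pending error before the loop starts
    for fullline in theText.splitlines():
        match = ERROR_RE.match(fullline)
        if match:
            path, line, error = match.group('path', 'line', 'error')
            if path.endswith(mainName):
                callSomething(line, error)
                error = None  # already reported
            continue
        match = stackframe_re.match(fullline)
        if match and error:  # only while an unreported error is pending
            path, line = match.group('path'), match.group('line')
            if path.endswith(mainName):
                callSomething(line, error)
                error = None  # don't report this error again
```

Here `line` is passed on as a string, exactly as the regex captured it; convert it with `int()` if a number is needed.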

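I don't really understand what you mean by "loop the remaining elements in a loop". By default, a loop already continues on to the remaining elements.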

answered 2013-06-27T13:03:18.560