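Hello, I am currently trying to parse a script that contains file paths like the ones shown below. I want to parse the file with a regular expression and store the data in a single string, with the file names separated by '\n'. A sample file is given below.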

    SAMPLE FILE: ('#' marks a comment; commented-out lines should stay commented out)
    add file -tls "../path1/path2/path3/example_1.edf"
    add file -tls "../path1/path2/path3/example_1.v"
    add file -tls "../path1/path2/path3/exa_4mple_1.sv"
    add file -tls "../path1/path2/path3/example_1.vh"        
    #add file -tls "../path1/path2/path3/exa_0mple_1.vhd"

    SAMPLE OUTPUT: (this example excludes the '\n' character)
    example_1.v
    exa_4mple_1.sv
    example_1.vh
    #exa_0mple_1.vhd

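How do I build the extension part of the "re" so that it matches only the extensions above and excludes all others? I would also like to know whether it is possible to capture the '#' of a commented-out line and put it in front of the file name.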

    -Desired function:
    for match in re.finditer(r'/([A-Za-z0-9_]+\..+)"', contents):
       mylist.append(match.group(1))

    -Working Code: ( tested on the '.v' file case )
    re.finditer(r'/([A-Za-z0-9_]+\.v)"', contents)

2 Answers

Is this what you want?

import re

contents = '''
add file -tls "../path1/path2/path3/example_1.edf"
add file -tls "../path1/path2/path3/example_1.v"
add file -tls "../path1/path2/path3/exa_4mple_1.sv"     
add file -tls "../path1/path2/path3/example_1.vh"     
#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"
'''

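print(contents)

# Group 1 captures an optional leading '#'; group 2 captures the file name
# when its extension contains the letter "v".
pat = r'^(#?)add file.+?"\.\./(?:\w+/)*(\w+?\.\w*v\w*)"\s*$'

# Join both groups so a commented-out line yields '#' + file name.
gen = (''.join(mat.groups())
       for mat in re.finditer(pat, contents, re.MULTILINE))

print('\n'.join(gen))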

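This pattern captures paths whose extension contains the letter "v", which is what I understood from your question.
Following your example, I also used the string add file as a matching criterion.
\w is used in the pattern, so directory and file names may contain only word characters (letters, digits, and underscores).
With this pattern, every path must start with ../
If any of these constraints do not fit your case, we can change whatever needs changing.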
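
Because "contains the letter v" is a loose test, it would also accept extensions you may not want. If you prefer to list the accepted extensions explicitly, a variant along these lines might work. This is only a sketch: the `EXTENSIONS` tuple and the name `strict_pat` are my own assumptions, and `vhd` is included only because the commented-out line appears in your sample output.

```python
import re

# Sample input from the question; trailing spaces are added explicitly.
contents = "\n".join([
    'add file -tls "../path1/path2/path3/example_1.edf"',
    'add file -tls "../path1/path2/path3/example_1.v"',
    'add file -tls "../path1/path2/path3/exa_4mple_1.sv"' + "     ",
    'add file -tls "../path1/path2/path3/example_1.vh"' + "     ",
    '#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"',
])

# Assumed allow-list of extensions (hypothetical; adjust as needed).
EXTENSIONS = ("v", "sv", "vh", "vhd")

# Build an alternation such as v|sv|vh|vhd, escaping each entry.
ext_alternation = "|".join(re.escape(ext) for ext in EXTENSIONS)

strict_pat = re.compile(
    r'^(#?)add file.+?"\.\./(?:\w+/)*(\w+\.(?:' + ext_alternation + r'))"\s*$',
    re.MULTILINE,
)

# Group 1 is '' or '#', group 2 is the file name.
names = [''.join(m.groups()) for m in strict_pat.finditer(contents)]
print('\n'.join(names))
```

The closing `"` right after the alternation keeps `v` from matching only the first letter of `vh` or `vhd`, so the order of the entries does not matter here.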

Note that I put \s* at the end of the pattern in case there is whitespace after the path on a line.
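
To see why the trailing `\s*` matters, here is a small check on a single line that ends in spaces, as two lines of your sample do. The variable names `with_ws` and `without_ws` are just illustrative.

```python
import re

# One line from the sample, with trailing spaces after the closing quote.
line = 'add file -tls "../path1/path2/path3/example_1.vh"' + "     "

with_ws = r'^(#?)add file.+?"\.\./(?:\w+/)*(\w+?\.\w*v\w*)"\s*$'
without_ws = r'^(#?)add file.+?"\.\./(?:\w+/)*(\w+?\.\w*v\w*)"$'

matched_with = re.search(with_ws, line) is not None
matched_without = re.search(without_ws, line) is not None

print(matched_with)     # True: \s* absorbs the trailing spaces
print(matched_without)  # False: '$' cannot match before the spaces
```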

answered 2013-06-29T09:39:33.210

No regular expression is needed:

>>> import os
>>> L = [
... "/path1/path2/path3/example_1.edf", 
... "/path1/path2/path3/example_1.v",
... "/path1/path2/path3/exa_4mple_1.sv", 
... "/path1/path2/path3/example_1.vh" ]
>>> for mypath in L:
...     if mypath.split('.')[-1] in ('v', 'sv', 'vh'):
...         print(os.path.split(mypath)[1])
... 
example_1.v
exa_4mple_1.sv
example_1.vh

Or as a list comprehension:

>>> [os.path.split(mypath)[1] 
... for mypath in L 
... if mypath.split('.')[-1] in ('v', 'sv', 'vh')]
['example_1.v', 'exa_4mple_1.sv', 'example_1.vh']
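
The list above already holds clean paths. If you start from the raw script lines instead, the same idea can be stretched to pull the path out of the quotes and keep the leading '#'. This is only a rough sketch: the function name `wanted_names` and the `WANTED` set are made up for illustration, and it assumes each line holds exactly one double-quoted path.

```python
import os

# Raw lines as they appear in the sample script from the question.
lines = [
    'add file -tls "../path1/path2/path3/example_1.edf"',
    'add file -tls "../path1/path2/path3/example_1.v"',
    'add file -tls "../path1/path2/path3/exa_4mple_1.sv"',
    'add file -tls "../path1/path2/path3/example_1.vh"   ',
    '#add file -tls "../path1/path2/path3/exa_0mple_1.vhd"',
]

# Assumed set of accepted extensions (hypothetical; adjust as needed).
WANTED = {".v", ".sv", ".vh", ".vhd"}


def wanted_names(script_lines):
    """Return file names with a wanted extension, keeping a leading '#'."""
    result = []
    for raw in script_lines:
        line = raw.strip()
        # Remember whether the line was commented out.
        prefix = "#" if line.startswith("#") else ""
        # The path is taken as the text between the first and last quote.
        start, end = line.find('"'), line.rfind('"')
        if start == -1 or end <= start:
            continue  # no quoted path on this line
        path = line[start + 1:end]
        if os.path.splitext(path)[1] in WANTED:
            result.append(prefix + os.path.basename(path))
    return result


names = wanted_names(lines)
print('\n'.join(names))
```

`os.path.splitext` compares the whole extension, so `.edf` is dropped without any string splitting on '.'.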
answered 2013-06-29T07:41:23.543