1

I have a txt file that I want Python to read, and I want Python to extract the string between two characters from it. Here is an example:

Line a

Line b

Line c

&TESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTEST !

Line d

Line e

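What I want is for Python to read the lines and, when it encounters "&", start printing the lines (including the line with the "&") until it encounters "!".

Any suggestions?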


3 Answers

4

This works:

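data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        # Start collecting at a line that begins with '&'.
        if line.startswith('&'):
            flag = True
        if flag:
            data.append(line)
        # Stop after a line that ends with '!' (ignoring trailing whitespace).
        if line.strip().endswith('!'):
            flag = False

print(''.join(data))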
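As a sketch of the same flag technique in a form that can be tried without a file on disk, the loop can be wrapped in a function that accepts any iterable of lines. The function name `lines_between`, its `start`/`end` parameters, and the sample lines are illustrative assumptions, not part of the original answer.

```python
def lines_between(lines, start="&", end="!"):
    """Return the lines from one starting with `start` through one ending with `end`."""
    captured = []
    capturing = False
    for line in lines:
        if line.startswith(start):
            capturing = True
        if capturing:
            captured.append(line)
        # Trailing whitespace is ignored when looking for the end marker.
        if line.strip().endswith(end):
            capturing = False
    return captured


# Hypothetical sample input; an open file object could be passed instead.
sample = ["Line a\n", "Line b\n", "&TEST\n", "TEST !\n", "Line d\n"]
print("".join(lines_between(sample)), end="")
```

Because a file object is itself an iterable of lines, `lines_between(f)` inside a `with open(...)` block should behave like the loop above.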

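If your file is small enough that reading it all into memory is not a problem, and there is no ambiguity about & and ! marking the start and end of the string you want, this is easier: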

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

print(data[data.index('&'):data.index('!') + 1])
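One caveat with the snippet above: `str.index` raises `ValueError` when the character is absent, and `data.index('!')` finds the first `!` anywhere in the text, even one that comes before the `&`. A minimal sketch that guards against both cases, assuming a hypothetical helper named `between_markers`:

```python
def between_markers(text, start="&", end="!"):
    """Return the substring from the first `start` through the next `end`, or None."""
    begin = text.find(start)  # str.find returns -1 instead of raising
    if begin == -1:
        return None
    # Search for the end marker only after the start marker.
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return None
    return text[begin:stop + len(end)]


print(between_markers("Line a\n&TEST TEST !\nLine d\n"))
```

Returning `None` is just one design choice here; raising a more descriptive exception would work equally well if a missing marker should be treated as an error.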

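Or, if you want to read the entire file in but only use & and ! when they are at the beginning and end of a line respectively, you can use a regex: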

import re

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# re.M lets ^ match at the start of any line; re.S lets . match newlines.
m = re.search(r'^(&.*!)\s*?\n', data, re.S | re.M)
if m:
    print(m.group(1))
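Because `.*` is greedy and `re.S` lets it cross newlines, the pattern above runs from the first line starting with `&` to the last line ending with `!`. If the file may contain several such blocks, a non-greedy pattern with `re.findall` is one possible variation. The sample text below is made up for illustration.

```python
import re

# Hypothetical sample text containing two separate blocks.
text = "Line a\n&first\nblock!\nLine b\n&second block!  \nLine c\n"

# ^ and $ anchor at line boundaries (re.M). The lazy .*? stops at the first
# "!" that is followed only by spaces or tabs before the end of its line,
# and re.S lets the dot cross newlines so a block may span several lines.
pattern = re.compile(r"^(&.*?!)[ \t]*$", re.S | re.M)

blocks = pattern.findall(text)
for block in blocks:
    print(block)
```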
answered 2013-07-13T17:19:00.477
0

Here is a (very simple!) example.

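def Printer():
    # Print every line from one containing "&" through one containing "!".
    with open("yourfile.txt") as f:
        Pr = False
        for line in f:
            if "&" in line:
                Pr = True
            if Pr:
                # Each line already ends with a newline, so suppress the extra one.
                print(line, end="")
            if "!" in line:
                Pr = False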
answered 2013-07-13T17:10:27.330
0

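A simple solution is shown below. The code is heavily commented to help you follow each line. A nice feature of the code is that it uses the with statement, which closes the file automatically even if an exception is raised.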

# Path to the input file (relative to the current working directory).
file_path = "input.txt"

# The with statement closes the file automatically, even if an exception is raised.
with open(file_path, "r") as f:
    # Be careful here: this reads the entire file into memory.
    # If the file is too large, prefer iterating over the file object.
    content = f.read()

size = len(content)
start = 0
while start < size:
    # Find the next & at or after the position following the previous !.
    start = content.find("&", start)
    if start == -1:
        # No further & in the file, so there is nothing left to print.
        break
    # Find the first ! after that &.
    end = content.find("!", start)
    # If no ! character is found, print till the end of the file.
    if end == -1:
        end = size
    # Print the contents between & and ! (excluding both characters).
    print(content[start + 1:end])
    # Move the cursor forward, past the position of the ! character.
    start = end + 1
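The comments above suggest iterating over the file object when the file is too large to read at once. A rough sketch of that idea follows, assuming a hypothetical generator named `iter_between` that keeps only the current block in memory. It is one possible way to do this, not the answer's own code.

```python
import io


def iter_between(lines, start="&", end="!"):
    """Yield each piece of text found between `start` and `end`, delimiters excluded."""
    buffer = None  # None while outside a block; a list of pieces while inside one
    for line in lines:
        pos = 0
        while pos < len(line):
            if buffer is None:
                begin = line.find(start, pos)
                if begin == -1:
                    break  # no start marker in the rest of this line
                buffer = []
                pos = begin + len(start)
            else:
                stop = line.find(end, pos)
                if stop == -1:
                    buffer.append(line[pos:])  # block continues on the next line
                    break
                buffer.append(line[pos:stop])
                yield "".join(buffer)
                buffer = None
                pos = stop + len(end)
    if buffer is not None:
        # No closing marker: yield everything up to the end of the input,
        # matching the "print till the end of file" behaviour described above.
        yield "".join(buffer)


# io.StringIO stands in for a file opened with open(file_path, "r").
sample = io.StringIO("Line a\n&TEST\nTEST !\nLine d\n&x!&y")
print(list(iter_between(sample)))
```

With a real file, `for block in iter_between(f): print(block)` inside the `with` block would process the input one line at a time.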
answered 2013-07-13T18:02:41.660