Input:

!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/1
2/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:14,000.
0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W
55.576,+0013!,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013!,A,56
281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34
:18,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:19,000.0,0,37N22.

Output:

!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:14,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:18,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:19,000.0,0,37N22.

'!' is the start character, and +0013 should be the end of each line (when it is present).

The problem I'm having: my output looks like this:

!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/1
2/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:14,000.
0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W

Any help would be greatly appreciated!

My code:

file_open= open('sample.txt','r') 
file_read= file_open.read() 
file_open2= open('output.txt','w+') 
counter =0 
for i in file_read: 
    if '!' in i: 
        if counter == 1: 
            file_open2.write('\n') 
            counter= counter -1 
        counter= counter +1 
    file_open2.write(i)
6 Answers

You can try something like this:

with open("abc.txt") as f:
    # Remove the line breaks; they may be "\n" rather than "\r\n" on your system.
    data = f.read().replace("\r\n", "").replace("\n", "")

# Split the data at '!', then filter out the empty strings.
ans = filter(None, data.split("!"))
for x in ans:
    print("!" + x)  # or write to some other file

!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:14,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:18,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:19,000.0,0,37N22.
answered 2013-01-18T21:05:33.990

Could you just use str.split?

lines = file_read.split('!')

Now lines is a list holding the split data. These are almost the lines you want to write; the only differences are that they lack the trailing newline and the leading '!'. Both are easy to add with string formatting, e.g. '!{0}\n'.format(line). Because the data starts with '!', the first element of the list is an empty string, so skip empty pieces. Wrap the whole thing in a generator expression and pass it to file.writelines to put the data in a new file:

file_open2.writelines('!{0}\n'.format(line) for line in lines if line)

You might need:

file_open2.writelines('!{0}\n'.format(line.replace('\n','')) for line in lines if line)

if you find that you're getting more newlines than you wanted in the output.
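
As a self-contained illustration of the split-and-format idea above, here is a minimal sketch that works on an in-memory string. The function name rejoin_records and the shortened sample string are made up for this example; they are not from the question.

```python
def rejoin_records(text, sep="!"):
    """Return one record per list item, each starting with `sep`."""
    # Remove the line breaks that wrapped the original text.
    flat = text.replace("\r", "").replace("\n", "")
    # split() yields an empty first chunk when the text starts with `sep`,
    # so empty chunks are skipped.
    return [sep + chunk for chunk in flat.split(sep) if chunk]


# Shortened, made-up sample: the second record is wrapped across two lines.
sample = "!,A,1,+0013!,A,2,+0\n013!,A,3"
records = rejoin_records(sample)
print("\n".join(records))
```

Writing "\n".join(records) to the output file gives one record per line without a stray "!" line at the top.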

One other point: when opening files, it's nice to use a context manager, which makes sure that the file is closed properly:

with open('inputfile') as fin:
    lines = fin.read().split('!')  # split into records; iterating the raw string would yield single characters
with open('outputfile','w') as fout:
    fout.writelines('!{0}\n'.format(line.replace('\n','')) for line in lines if line)
answered 2013-01-18T21:05:23.490

Just for some variety, here is a regular-expression answer:

import re

outputFile = open('output.txt', 'w+') 
with open('sample.txt', 'r') as f: 
    for line in re.findall("!.+?(?=!|$)", f.read(), re.DOTALL): 
        outputFile.write(line.replace("\n", "") + '\n') 

outputFile.close() 

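It opens the output file, reads the contents of the input file, and iterates over every match of the regular expression !.+?(?=!|$) with the re.DOTALL flag. An explanation of the regex and what it matches can be found here: http://regex101.com/r/aK6aV4

Once a match is found, we remove the newlines from it and write it to the file.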
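
To see what the pattern matches without touching any files, here is a small sketch that runs the same regex on an in-memory string. The names RECORD and find_records and the shortened sample are assumptions made for this example.

```python
import re

# '!' followed by as few characters as possible (newlines included, because
# of DOTALL), stopping just before the next '!' or at the end of the text.
RECORD = re.compile(r"!.+?(?=!|$)", re.DOTALL)


def find_records(text):
    """Return every '!'-prefixed record with embedded newlines removed."""
    return [match.replace("\n", "") for match in RECORD.findall(text)]


# Shortened, made-up sample: the second record is wrapped across two lines.
sample = "!,A,1,+0013!,A,2,+0\n013!,A,3"
records = find_records(sample)
print("\n".join(records))
```

The lazy quantifier .+? is what keeps each match to a single record; a greedy .+ would run to the end of the text and return everything as one match.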

answered 2013-01-18T21:39:04.940

Another option is to use replace instead of split, since you know the start and end characters of each line:

In [14]: data = """!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/1
2/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:14,000.
0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W
55.576,+0013!,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013!,A,56
281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34
:18,000.0,0,37N22.714,121W55.576,+0013!,A,56281,12/12/19,19:34:19,000.0,0,37N22.""".replace('\n', '')

In [15]: print data.replace('+0013!', "+0013\n!")
!,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:14,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:18,000.0,0,37N22.714,121W55.576,+0013
!,A,56281,12/12/19,19:34:19,000.0,0,37N22.
answered 2013-01-18T21:10:39.900

I would actually implement this as a generator, so that you can work on the data stream rather than on the entire content of the file. This is quite memory-friendly when working with huge files.

>>> def split_on_stream(it,sep="!"):
    prev = ""
    for line in it:
        line = (prev + line.strip()).split(sep)
        for parts in line[:-1]:
            yield parts
        prev = line[-1]
    yield prev


>>> with open("test.txt") as fin:
    for parts in split_on_stream(fin):
        print parts



,A,56281,12/12/19,19:34:12,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:13,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:14,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:15,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:16,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:17,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:18,000.0,0,37N22.714,121W55.576,+0013
,A,56281,12/12/19,19:34:19,000.0,0,37N22.
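
A possible variant of the same streaming idea, written so that it also runs on Python 3, puts the '!' back on each record and skips the empty piece before the first '!'. The name iter_records is made up for this sketch, and it assumes the input starts with the separator; io.StringIO stands in for an open file.

```python
import io


def iter_records(lines, sep="!"):
    """Yield complete records from an iterable of lines, one at a time."""
    pending = ""  # tail of the previous line that may continue on the next one
    for line in lines:
        chunks = (pending + line.rstrip("\r\n")).split(sep)
        # Every chunk except the last is complete; the last may be cut off.
        for chunk in chunks[:-1]:
            if chunk:  # skip the empty piece before a leading separator
                yield sep + chunk
        pending = chunks[-1]
    if pending:
        yield sep + pending


# Shortened, made-up sample: the second record is wrapped across two lines.
stream = io.StringIO("!,A,1,+0013!,A,2,+0\n013!,A,3\n")
records = list(iter_records(stream))
print("\n".join(records))
```

Only one input line plus the unfinished tail is held in memory at a time, so the same loop can be pointed at an open file of any size.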
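answered 2013-01-19T08:39:52.420

Let's try adding a \n before each "!" (after removing the existing line breaks), then let Python split the lines :-):

[line for line in file_read.replace("\n", "").replace("!", "\n!").splitlines() if line]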
answered 2013-01-18T21:09:42.037