I have files in the following format. Each record always starts with InvNo, and ~EOR~ marks the end of a record.
InvNo: 123
Tag1: rat cake
Media: d234
Tag2: rat pudding
~EOR~
InvNo: 5433
Tag1: strawberry tart
Tag5: 's got some rat in it
~EOR~
InvNo: 345
Tag2: 5
Media: d234
Tag5: rather a lot really
~EOR~
It should become:
IN 123
UR blabla
**
IN 345
UR blibli
**
where UR is a URL. I want to keep InvNo as the first tag. ** is now the end-of-record marker. This works:
impfile = filename[:4]
media = open(filename + '_earmark.dat', 'w')
with open(impfile, 'r') as f:
    HASMEDIA = False
    recordbuf = ''
    for line in f:
        if 'InvNo: ' in line:
            # InvNo keeps its trailing newline, so no '\n' is added below
            InvNo = line[line.find('InvNo: ')+7:len(line)]
            recordbuf = 'IN {}'.format(InvNo)
        if 'Media: ' in line:
            HASMEDIA = True
            mediaref = line[7:len(line)-1]
            URL = getURL(mediaref) # there's more to it, but that's not important now
            recordbuf += 'UR {}\n'.format(URL)
        if '~EOR~' in line:
            # only records that had a Media line are written out
            if HASMEDIA:
                recordbuf += '**\n'
                media.write(recordbuf)
            HASMEDIA = False
            recordbuf = ''
media.close()
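Is there a better, more Pythonic way to do this? Using a record buffer and a HASMEDIA flag feels rather old hat. Any examples or tips on good or better practice?
(Also, I'm open to suggestions for a more to-the-point title for this post.)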
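
For what it's worth, one direction that might avoid both the buffer and the flag is to split the job in two: a generator that yields one dict per record, and a second step that formats only the records with a Media tag. This is just a sketch under a few assumptions: `read_records`, `convert` and `get_url` are names made up for illustration, `get_url` is a placeholder for the real `getURL` (which isn't shown here), every tag line is assumed to look like `Tag: value`, and every record is assumed to end with `~EOR~`.

```python
def get_url(mediaref):
    # Hypothetical placeholder for the real getURL(), which is not shown in the post.
    return '<url for {}>'.format(mediaref)


def read_records(lines, eor='~EOR~'):
    """Yield one dict of tag -> value for each record terminated by the EOR marker."""
    record = {}
    for line in lines:
        line = line.strip()
        if line == eor:
            if record:
                yield record
            record = {}
        elif ': ' in line:
            # Split on the first ': ' only, so values may contain colons.
            tag, value = line.split(': ', 1)
            record[tag] = value
    # Assumption: a trailing record without an EOR marker is silently dropped.


def convert(lines, url_for=get_url):
    """Yield the output text for each record that has a Media tag."""
    for record in read_records(lines):
        if 'Media' in record:
            yield 'IN {}\nUR {}\n**\n'.format(record['InvNo'],
                                              url_for(record['Media']))


SAMPLE = """\
InvNo: 123
Tag1: rat cake
Media: d234
Tag2: rat pudding
~EOR~
InvNo: 5433
Tag1: strawberry tart
Tag5: 's got some rat in it
~EOR~
InvNo: 345
Tag2: 5
Media: d234
Tag5: rather a lot really
~EOR~
"""

result = ''.join(convert(SAMPLE.splitlines()))
print(result, end='')
```

With a real file, the same `convert` could be fed the open file object directly, and its output passed to `writelines` on the output file, so that both files can live inside a single `with` block.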