I need help parsing a text file that looks like this:
WKU 03487472
WKU 3487472
Filed Apr. 30, 1968, Ser. No. 725,329
Int. Cl. A42b 1122
AISD 19700106
WKU D2487471
AISD 19700308
WKU 03487471
Filed J 16, 1969
[51] Int. Cl. A41d 25104
AISD 19700106
I would like to get CSV output like this:
WKU Filed Int. AISD
03487472 Apr. 30, 1968 A42b 1122 19700106
D2487471 . . 19700308
03487471 J 16, 1969 A41d 25104 19700106
I am not a programmer and have just started using Python. I tried the following script:
import csv
import itertools

def is_end_of_record(line):
    return line.startswith('WKU')

class FieldClassifier(object):
    def __init__(self):
        self.field=''
    def __call__(self,row):
        if not row[0].isspace():
            self.field=row.split(' ',1)[0]
        return self.field

fields = 'WKU Filed Int. AISD'.split()

with open('C:\Users\Na\Desktop\example.txt', 'r') as infile:
    with open('example.csv', 'wb') as oufile:
        writer = csv.DictWriter(oufile, fiels=fields)
        writer.writerow(dict((h, h) for h in fields))
        for end_of_record, lines in itertools.groupby(infile,is_end_of_record):
            if not end_of_record:
                classifier=FieldClassifier()
                record={}
                for fieldname, row in itertools.groupby(lines,classifier):
                    record[fieldname]='; '.join(r.strip() for r in row)
It doesn't seem to work properly. I would appreciate any help or suggestions.
Thanks,
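
For reference, here is a minimal sketch of one way to produce the output described above. It is not a corrected version of the script in the question. It assumes Python 3, assumes every record begins with a single WKU line (so the demo input leaves out the second `WKU 3487472` line of the sample), and assumes the `Filed` column should keep only the date before `, Ser. No.`. The names `parse_records`, `write_csv`, `FIELDS` and `SAMPLE` are made up for illustration.

```python
import csv
import io
import re

# Column names taken from the desired output in the question.
FIELDS = ["WKU", "Filed", "Int.", "AISD"]

# Demo input copied from the question (assumption: one WKU line per record).
SAMPLE = """\
WKU 03487472
Filed Apr. 30, 1968, Ser. No. 725,329
Int. Cl. A42b 1122
AISD 19700106
WKU D2487471
AISD 19700308
WKU 03487471
Filed J 16, 1969
[51] Int. Cl. A41d 25104
AISD 19700106
"""


def parse_records(lines):
    """Yield one dict per record; a line starting with 'WKU' opens a new record."""
    record = None
    for raw in lines:
        # Drop a bracketed prefix such as "[51] " before matching the field name.
        line = re.sub(r"^\[\d+\]\s*", "", raw.strip())
        if line.startswith("WKU"):
            if record is not None:
                yield record
            record = {"WKU": line[len("WKU"):].strip()}
        elif record is None:
            continue  # ignore anything before the first WKU line
        elif line.startswith("Filed"):
            value = line[len("Filed"):].strip()
            # Keep only the date, dropping a trailing ", Ser. No. ..." part.
            record["Filed"] = value.split(", Ser.")[0]
        elif line.startswith("Int. Cl."):
            record["Int."] = line[len("Int. Cl."):].strip()
        elif line.startswith("AISD"):
            record["AISD"] = line[len("AISD"):].strip()
    if record is not None:
        yield record


def write_csv(records, outfile):
    """Write the records as CSV; fields missing from a record become '.'."""
    writer = csv.DictWriter(outfile, fieldnames=FIELDS, restval=".")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Demo: write to an in-memory buffer and print the CSV text.
buffer = io.StringIO()
write_csv(parse_records(SAMPLE.splitlines()), buffer)
print(buffer.getvalue())
```

To write a real file instead, an open file object can be passed to both functions, for example `open('example.csv', 'w', newline='')` for the output. Under Python 3 the Windows input path would need to be a raw string (`r'C:\Users\...'`), because `\U` inside a normal string literal is read as the start of an escape sequence. Dates that contain a comma are quoted by the `csv` module so that they stay in one column.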