Here is the sample text:

ACCESSION NUMBER:           0001054274-12-000001
CONFORMED SUBMISSION TYPE:  D
PUBLIC DOCUMENT COUNT:      1
ITEM INFORMATION:           Rule 506
FILED AS OF DATE:           20120301
DATE AS OF CHANGE:          20120301
EFFECTIVENESS DATE:         20120301

FILER:

COMPANY DATA:   
    COMPANY CONFORMED NAME:               Alliqua, Inc.
    CENTRAL INDEX KEY:                    0001054274
    STANDARD INDUSTRIAL CLASSIFICATION:   SURGICAL & MEDICAL INSTRUMENTS & APPARATUS [3841]
    IRS NUMBER:                           582349413
    STATE OF INCORPORATION:               FL
    FISCAL YEAR END:                      1220A

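I am trying to extract all of the variables (accession number, conformed submission type, ..., fiscal year end) and eventually write them to a .csv file. Any suggestions?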

1 Answer

I would split each line on the first ":" and strip whitespace from the two resulting parts:

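data = {}
with open(filename) as inputf:
    for line in inputf:
        # Skip blank lines and anything without a label/value separator
        if ':' not in line:
            continue
        # Split on the first colon only, so values containing ':' stay intact
        label, value = map(str.strip, line.split(':', 1))
        # Drop section headers such as 'FILER:' that carry no value
        if label and value:
            data[label] = value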

For the sample text, this produces the following mapping:

{'ACCESSION NUMBER': '0001054274-12-000001',
 'CENTRAL INDEX KEY': '0001054274',
 'COMPANY CONFORMED NAME': 'Alliqua, Inc.',
 'CONFORMED SUBMISSION TYPE': 'D',
 'DATE AS OF CHANGE': '20120301',
 'EFFECTIVENESS DATE': '20120301',
 'FILED AS OF DATE': '20120301',
 'FISCAL YEAR END': '1220A',
 'IRS NUMBER': '582349413',
 'ITEM INFORMATION': 'Rule 506',
 'PUBLIC DOCUMENT COUNT': '1',
 'STANDARD INDUSTRIAL CLASSIFICATION': 'SURGICAL & MEDICAL INSTRUMENTS & APPARATUS [3841]',
 'STATE OF INCORPORATION': 'FL'}
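
The question also asks about writing the result to a .csv file, which the snippet above does not cover. One possible way to do that is with `csv.DictWriter` from the standard library; the following is only a sketch. The helper names `parse_header` and `write_csv` and the shortened `sample` string are hypothetical and used for illustration, and it assumes one row per filing with the labels as column names.

```python
import csv
import io


def parse_header(lines):
    """Collect 'LABEL: value' pairs, splitting on the first colon only."""
    data = {}
    for line in lines:
        if ':' not in line:
            continue
        label, value = map(str.strip, line.split(':', 1))
        if label and value:
            data[label] = value
    return data


def write_csv(records, out):
    """Write a list of dicts as CSV rows (hypothetical helper).

    The columns are the union of all labels, in first-seen order;
    labels missing from a record are written as empty cells.
    """
    fieldnames = []
    for record in records:
        for label in record:
            if label not in fieldnames:
                fieldnames.append(label)
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(records)


# Shortened excerpt of the sample text, for illustration only
sample = """ACCESSION NUMBER:           0001054274-12-000001
CONFORMED SUBMISSION TYPE:  D
FILER:
    COMPANY CONFORMED NAME:               Alliqua, Inc.
"""

record = parse_header(sample.splitlines())
buffer = io.StringIO()  # stands in for a real file here
write_csv([record], buffer)
print(buffer.getvalue())
```

When writing to a real file rather than a `StringIO`, open it with `newline=''` (for example `open('filings.csv', 'w', newline='')`, where the file name is just a placeholder) so the `csv` module controls the line endings. Values that contain a comma, such as `Alliqua, Inc.`, are quoted automatically by the writer.
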
answered 2013-01-17T16:20:04.253