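My suggestion is to use a dictionary keyed by the 5-digit line type code. Each value in the dictionary can be a list of field offsets (or of (offset, width) tuples), indexed by field position.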
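As a rough sketch of that idea: the record codes, field widths, and the names `layouts` and `get_field` below are all made up for illustration, not taken from your data.

```python
# Hypothetical layouts: record type code -> list of (offset, width) tuples,
# indexed by field position. The codes and widths here are invented.
layouts = {
    '10001': [(0, 5), (5, 10), (15, 4)],
    '20002': [(0, 5), (5, 3), (8, 12)],
}

def get_field(row, position):
    """Return the field at `position`, using the layout for the row's type."""
    record_type = row[:5]
    offset, width = layouts[record_type][position]
    return row[offset:offset + width]

row = '10001ABCDEFGHIJ2024'
print(get_field(row, 1))  # ABCDEFGHIJ
print(get_field(row, 2))  # 2024
```

The record type is read once from the first five characters, and everything else is a plain list lookup by field position.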
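If your fields have names, it might be convenient to use a class instead of a list to store the field offset data. However, namedtuples might be better here, since you can then access your field offset data either by name or by field position, so you get the best of both worlds.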
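A tiny illustration of that dual access, using a hypothetical two-field `Layout` type (the name and the offsets are assumptions for the example):

```python
from collections import namedtuple

# Hypothetical two-field layout; each value is an (offset, width) tuple.
Layout = namedtuple('Layout', ['record_type', 'amount'])
layout = Layout(record_type=(0, 5), amount=(5, 8))

# Access by field name...
print(layout.amount)    # (5, 8)
# ...or by field position; both refer to the same data.
print(layout[1])        # (5, 8)
# The field names are available in declaration order.
print(Layout._fields)   # ('record_type', 'amount')
```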
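A namedtuple is actually implemented as a class, but defining a new namedtuple type is much more compact than writing an explicit class definition. namedtuples also use the __slots__ protocol, so they take up less RAM than a normal class that uses __dict__ to store its attributes.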
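You can check the __slots__ point for yourself. This is a small sketch assuming a recent Python 3; `Point` and `PlainPoint` are hypothetical names used only for the comparison.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])   # hypothetical example type

class PlainPoint(object):
    """An ordinary class; each instance carries its own __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

nt = Point(1, 2)
plain = PlainPoint(1, 2)

# The generated namedtuple class declares empty __slots__ ...
print(Point.__slots__)            # ()
# ... so its instances have no per-instance attribute dictionary.
print(hasattr(nt, '__dict__'))    # False on current Python 3
# A normal class instance stores its attributes in a dict.
print(plain.__dict__)             # {'x': 1, 'y': 2}
```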
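Here is one way to use namedtuples to store field offset data. I'm not claiming that the following code is the best way to do this, but it should give you some ideas.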
from collections import namedtuple

# Create a namedtuple, `Fields`, containing all field names
fieldnames = [
    'record_type',
    'special',
    'communication',
    'id_number',
    'transaction_code',
    'amount',
    'other',
]
Fields = namedtuple('Fields', fieldnames)

# Some fake test data
data = [
    #          1         2         3         4         5
    #012345678901234567890123456789012345678901234567890123
    "12455WE READ THIS TOO            796445 125997  554777",
    "22455 888AND THIS TOO      796445 125997  55477778 2 1",
]

# A dict to store the field (offset, width) data for each field in a record,
# keyed by record type, which is always stored at (0, 5)
offsets = {}

# Some fake record structures; None marks a field the record type lacks
offsets['12455'] = Fields(
    record_type=(0, 5),
    special=None,
    communication=(5, 28),
    id_number=(33, 6),
    transaction_code=(40, 6),
    amount=(48, 6),
    other=None)

offsets['22455'] = Fields(
    record_type=(0, 5),
    special=(6, 3),
    communication=(9, 18),
    id_number=(27, 6),
    transaction_code=(34, 6),
    amount=(42, 8),
    other=(51, 3))

# Test.
for row in data:
    print(row)
    # Get record type
    rt = row[:5]
    # Get field structure
    fields = offsets[rt]
    for name in fieldnames:
        # Get field offset data by field name
        t = getattr(fields, name)
        if t is not None:
            start, flen = t
            stop = start + flen
            # Slice the field out of the row (padding is kept as-is)
            value = row[start:stop]
            print("%-16s ... %r" % (name, value))
    print()
Output
12455WE READ THIS TOO            796445 125997  554777
record_type      ... '12455'
communication    ... 'WE READ THIS TOO            '
id_number        ... '796445'
transaction_code ... '125997'
amount           ... '554777'

22455 888AND THIS TOO      796445 125997  55477778 2 1
record_type      ... '22455'
special          ... '888'
communication    ... 'AND THIS TOO      '
id_number        ... '796445'
transaction_code ... '125997'
amount           ... '55477778'
other            ... '2 1'
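If you want the parsed values rather than printed output, the same lookup could be wrapped in a helper. This is only a sketch: `parse_record` is a hypothetical name, the layout is cut down to three fields, the '32455' record type is invented, and stripping the padding is my assumption about what you'd want.

```python
from collections import namedtuple

Fields = namedtuple('Fields', ['record_type', 'communication', 'id_number'])

# Cut-down layouts for illustration; None marks a field the record type lacks.
# '32455' is an invented record type, not one from the data above.
offsets = {
    '12455': Fields(record_type=(0, 5), communication=(5, 28), id_number=(33, 6)),
    '32455': Fields(record_type=(0, 5), communication=None, id_number=(5, 6)),
}

def parse_record(row, offsets):
    """Return a dict mapping field name -> stripped text for one row."""
    layout = offsets[row[:5]]
    record = {}
    # Iterating a namedtuple yields its values in field order,
    # so zip pairs each field name with its (offset, width) entry.
    for name, spec in zip(layout._fields, layout):
        if spec is not None:
            start, width = spec
            record[name] = row[start:start + width].strip()
    return record

row = "12455" + "WE READ THIS TOO".ljust(28) + "796445"
print(parse_record(row, offsets))
# {'record_type': '12455', 'communication': 'WE READ THIS TOO', 'id_number': '796445'}
print(parse_record("32455796445", offsets))
# {'record_type': '32455', 'id_number': '796445'}
```

Returning a dict keeps the calling code independent of the offsets, and fields that a record type doesn't have are simply absent from the result.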