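I have to process data supplied by a government agency, and that data is occasionally broken in strange ways. My code already contains snippets like this: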
for row in governmental_data:
    # XXX Workaround for that one row among thousands
    # that was mislabeled by a clerk and will not be fixed
    # before form A-320-Tango-5 is completed and submitted
    # on the first Sunday after a solstice.
    if row is the_spawn_of_satan:
        row = fix_row_A320(row)
    # XXX end of workaround
    process_row(row)
Before the broken row showed up, the loop was simply:
for row in governmental_data:
    process_row(row)
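Because the data is live, I cannot keep a mirror of it with the fixes already applied.
What can I do to manage these workarounds as their number grows? Are there any best practices (other than "don't serve broken data")?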
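To make the question concrete, here is a rough sketch of one direction I have considered: pulling the workarounds out of the loop into a table of (predicate, fix) pairs that is applied on the fly as rows arrive. Everything in it is an assumption made for illustration: the `Workaround` type, `apply_workarounds`, the dict-shaped rows, and the `id`/`label` fields are hypothetical and not part of my real code or of the agency's format.

```python
from typing import Callable, Iterable, Iterator, NamedTuple, Sequence

Row = dict  # assumption: a row is a dict; the real row type is not shown


class Workaround(NamedTuple):
    name: str                       # reference to the form or ticket behind the fix
    applies: Callable[[Row], bool]  # predicate identifying the broken row(s)
    fix: Callable[[Row], Row]       # returns a repaired copy of the row


def apply_workarounds(
    rows: Iterable[Row], workarounds: Sequence[Workaround]
) -> Iterator[Row]:
    """Lazily yield rows, repairing any row matched by a registered workaround."""
    for row in rows:
        for workaround in workarounds:
            if workaround.applies(row):
                row = workaround.fix(row)
        yield row


# Hypothetical registry; each entry documents one known defect in the feed.
WORKAROUNDS = [
    Workaround(
        name="A-320-Tango-5: row mislabeled by a clerk",
        applies=lambda row: row.get("id") == 666,
        fix=lambda row: {**row, "label": "corrected"},
    ),
]
```

The main loop would then go back to its original shape, `for row in apply_workarounds(governmental_data, WORKAROUNDS): process_row(row)`, and retiring a workaround would mean deleting one entry from the list. I do not know whether this is considered good practice, which is part of what I am asking.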