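I'm having trouble applying a function to all the leaves of a dict (loaded from a JSON file) in Python. The text is badly encoded, and I want to use the ftfy module to fix it.
Here is my function: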

    import ftfy.bad_codecs  # registers the 'sloppy-windows-1252' codec used below

    def recursive_decode_dict(e):
        try:
            if type(e) is dict:
                print('Dict: %s' % e)
                return {k: recursive_decode_dict(v) for k, v in e.items()}
            elif type(e) is list:
                print('List: %s' % e)
                return list(map(recursive_decode_dict, e))
            elif type(e) is str:
                print('Str: %s' % e)
                print('Transformed str: %s' % e.encode('sloppy-windows-1252').decode('utf-8'))
                return e.encode('sloppy-windows-1252').decode('utf-8')
            else:
                return e
        except UnicodeError:
            # Assumption: strings that cannot be re-decoded are left unchanged
            return e

I call it like this:

    import json

    with open('test.json', 'r', encoding='utf-8') as f1:
        json_content = json.load(f1)
        recursive_decode_dict(json_content)
    with open('out.json', 'w', encoding='utf-8') as f2:
        json.dump(json_content, f2, indent=2)

The console output looks fine:

    > python fix_encoding.py
    List: [{'fields': {'field1': 'the European-style cafÃ© into a '}}]
    Dict: {'fields': {'field1': 'the European-style cafÃ© into a '}}
    Dict: {'field1': 'the European-style cafÃ© into a '}
    Str: the European-style cafÃ© into a 
    Transformed str: the European-style café into a 

But the text in my output file is still not fixed:

    [
      {
        "fields": {
          "field1": "the European-style caf\u00c3\u00a9 into a "
        }
      }
    ]
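
A likely explanation, sketched under assumptions: `recursive_decode_dict` builds and returns new dicts, lists, and strings rather than modifying `json_content` in place, so a call whose return value is discarded leaves the loaded data unchanged. The sketch below captures the return value before dumping. The names `apply_to_leaves` and `fix_mojibake` are hypothetical, and the standard `windows-1252` codec stands in for `sloppy-windows-1252` so the example runs without ftfy.

```python
import json


def apply_to_leaves(node, fn):
    """Return a NEW structure with fn applied to every string leaf."""
    if isinstance(node, dict):
        return {k: apply_to_leaves(v, fn) for k, v in node.items()}
    if isinstance(node, list):
        return [apply_to_leaves(v, fn) for v in node]
    if isinstance(node, str):
        return fn(node)
    return node  # numbers, booleans, None pass through untouched


def fix_mojibake(s):
    # Stand-in for the ftfy-based repair (assumption: plain 'windows-1252'
    # is enough for this sample; the post uses 'sloppy-windows-1252').
    try:
        return s.encode('windows-1252').decode('utf-8')
    except UnicodeError:
        return s  # leave strings that do not round-trip unchanged


# Same content as the sample shown in the output file above.
original = json.loads(
    '[{"fields": {"field1": "the European-style caf\\u00c3\\u00a9 into a "}}]'
)

# The key step: keep the returned structure and dump THAT one.
fixed = apply_to_leaves(original, fix_mojibake)
out = json.dumps(fixed, indent=2, ensure_ascii=False)
print(out)
```

`ensure_ascii=False` is optional; without it `json.dumps` writes the repaired character as the escape `\u00e9`, which is still valid JSON for the same string.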