python - How to properly write utf8 IPTC metadata with python library IPTCInfo?

Question

After generating a JPEG thumbnail file with PIL, I would like to use IPTCInfo to write IPTC metadata containing French characters with accents. I was thinking about using the UTF-8 character encoding.

So I tried the following:

from iptcinfo import IPTCInfo

info = IPTCInfo(input_file, force=True, inp_charset='utf8')
info.data['credit'] = some_unicode_string
info.saveAs(output_file)

and many other variations:

info = IPTCInfo(input_file, force=True)
info = IPTCInfo(input_file, force=True, inp_charset='utf8')
info = IPTCInfo(input_file, force=True, inp_charset='utf_8')
info = IPTCInfo(input_file, force=True, inp_charset='utf8', out_charset='utf8')
info = IPTCInfo(input_file, force=True, inp_charset='utf_8', out_charset='utf_8')
...

When I read the metadata back with IPTCInfo, the Python unicode string is preserved, but I always find weird characters when reading it with other software: OS X file information, ExifTool, Photoshop, ViewNX2. So what is the right way to write unicode with IPTCInfo and produce a standards-compliant file that all software can understand?

score 0 · Accepted Answer

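Some things related to your question. From the IPTC forum:

Using the XMP packet makes things really simple: UTF-8 is the default character set. You can therefore use, and even mix, different character sets and scripts.

The IPTC IIM header is a bit trickier: it contains a field that indicates which character set was used for the text fields (for IIM experts: this is dataset 1:90). Unfortunately, the vast majority of imaging software has not used this field, and only in recent years have some applications started to use it.

Also, in the IPTC EnvelopeRecord Tags, you will find: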

90 CodedCharacterSet string[0,32]!

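(Values are entered in the form "ESC X Y[, ...]". The escape sequence for UTF-8 character coding is "ESC % G", but this is displayed as "UTF8" for convenience. Either string may be used when writing. The value of this tag affects the decoding of string values in the Application and NewsPhoto records. This tag is marked as "unsafe" to prevent it from being copied by default in a group operation, because existing tags in the destination image may use a different encoding. When creating a new IPTC record from scratch, it is suggested that this be set to "UTF8" if special characters are a possibility.)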
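
To make dataset 1:90 concrete, here is a minimal sketch of the bytes such a dataset would hold when it declares UTF-8. It assumes the standard IIM dataset layout (tag marker 0x1C, record number, dataset number, two-byte big-endian length, then the value) and that Credit is dataset 2:110. The helper name build_iim_dataset is hypothetical and is not part of IPTCInfo.

```python
import struct

ESC = b"\x1b"
UTF8_ESCAPE = ESC + b"%G"  # "ESC % G", the escape sequence announcing UTF-8


def build_iim_dataset(record, dataset, value):
    """Serialize one IIM dataset with a standard (short) length field.

    Hypothetical helper for illustration only; not part of IPTCInfo.
    """
    if len(value) > 0x7FFF:
        raise ValueError("value too long for a standard dataset")
    # 0x1C tag marker, record number, dataset number, big-endian length
    return struct.pack(">BBBH", 0x1C, record, dataset, len(value)) + value


# Envelope record 1, dataset 90: CodedCharacterSet set to UTF-8
coded_character_set = build_iim_dataset(1, 90, UTF8_ESCAPE)
# Application record 2, dataset 110: Credit, stored as UTF-8 bytes
credit = build_iim_dataset(2, 110, "Édité par l'agence".encode("utf-8"))

print(coded_character_set.hex())
```

If a file's IPTC block has no 1:90 dataset, a reader has no reliable way to know that the Credit bytes are UTF-8, which would explain accented characters showing up garbled in other software.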

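See also -charset CHARSET

Certain meta information formats allow coded character sets other than plain ASCII. When reading, most known encodings are converted to the external character set according to the exiftool "-charset CHARSET" or -L option, or to UTF-8 by default. When writing, the inverse conversions are performed. Alternatively, special characters may be converted to/from HTML character entities with the -E option.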
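
As a cross-check outside IPTCInfo, the same metadata could be written with ExifTool itself. The sketch below assumes the exiftool executable is installed and on PATH; the tag names IPTC:CodedCharacterSet and IPTC:Credit come from the ExifTool tag list quoted above, while the two function names are hypothetical.

```python
import subprocess


def exiftool_write_args(path, credit):
    """Build an exiftool command that declares UTF-8 and writes IPTC Credit."""
    return [
        "exiftool",
        "-IPTC:CodedCharacterSet=UTF8",  # stored in the file as ESC % G
        "-IPTC:Credit=" + credit,
        path,
    ]


def write_credit(path, credit):
    # Assumes exiftool is on PATH; raises CalledProcessError on failure.
    subprocess.run(exiftool_write_args(path, credit), check=True)


print(exiftool_write_args("thumbnail.jpg", "Agence Frédéric"))
```

Passing the arguments as a list avoids shell quoting problems with accented characters. Whether the command line itself reaches exiftool as UTF-8 depends on the platform, so treat this as a starting point rather than a guaranteed recipe.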

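Although the comments in the IPTCInfo implementation are not very encouraging, the code still contains a dictionary of encodings that offers more clues.

In your code example, which looks correct, you assign: :)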

info.data['credit'] = some_unicode_string

What exactly is some_unicode_string? Are you sure it is a UTF-8 encoded string (!= unicode)?
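
The difference between a unicode string and a UTF-8 byte string can be sketched as follows. The example uses Python 3 names (str for text, bytes for encoded data); in Python 2 the corresponding types are unicode and str. The sample text is made up for illustration.

```python
text = "Crédit photo: Hélène"        # a text (unicode) string
utf8_bytes = text.encode("utf-8")     # the same text as UTF-8 bytes

# Each accented letter takes two bytes in UTF-8, so the lengths differ.
print(len(text), len(utf8_bytes))

# A reader that is not told the bytes are UTF-8 and assumes Latin-1
# produces the typical "weird characters".
misread = utf8_bytes.decode("latin-1")
print(misread)

# Decoding with the right charset recovers the original text.
assert utf8_bytes.decode("utf-8") == text
```

This is the same symptom described in the question: the bytes in the file can be valid UTF-8 and still display incorrectly when the reading software assumes another encoding.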
