c++ - What is the file format written by the C++ MFC CArchive object?

Question

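I am trying to read a file from C# that was written with CArchive. As far as I can tell, the format is:

[length of next data item][data]... and so on

However, some of the data is still unclear to me. How do I read date values? What about floats, integers, doubles, and so on?

Also, [length of next data item] can be a byte, a word, or a dword. How do I know which one to expect? For example, for the string "1.10" the data is: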

04 31 2e 31 30

04 is obviously the length, and the rest are the hex values for 1.10. Trivial. Later I have a string that is 41 characters long, but the [length] value is:

00 00 00 29

Why is the length 4 bytes? (0x29 = 41)

The main question is: is there a specification for the format that CArchive outputs?

score 8 · Accepted Answer

To answer your question about strings: the length value stored in the archive is itself variable-length, depending on the length and encoding of the string. If the string is shorter than 255 characters, one byte is used for the length. If the string is 255 to 65533 characters, 3 bytes are used: a 1-byte 0xFF marker followed by a 2-byte word (the word value 0xFFFE is reserved for the Unicode marker). If the string is 65534 characters or longer, 7 bytes are used: a 3-byte 0xFF 0xFF 0xFF marker followed by a 4-byte dword. To make it even more complicated, if the string is Unicode encoded, the length value is preceded by a 3-byte marker (a 0xFF byte followed by the word 0xFFFE). So in any combination, you will never see a 4-byte length by itself. What you showed has to be three 0x00 bytes belonging to something else, followed by a 1-byte string length of 0x29.
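To make the byte layouts concrete, here is a minimal sketch of the length prefix seen from the writing side. `encodeStringLength` is a hypothetical helper, not an MFC function, and it assumes multi-byte values are stored little-endian (low byte first), as on x86 Windows. The word form stops short of 0xFFFE because that value doubles as the Unicode marker.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of MFC: builds the length prefix described
// above for a string of `length` characters.
// Assumption: multi-byte values are stored little-endian (low byte first).
std::vector<std::uint8_t> encodeStringLength(std::uint32_t length, bool isUnicode) {
    std::vector<std::uint8_t> out;
    if (isUnicode) {
        out.push_back(0xFF);  // marker byte
        out.push_back(0xFE);  // word 0xFFFE, low byte
        out.push_back(0xFF);  // word 0xFFFE, high byte
    }
    if (length < 255) {
        // Short string: the length fits in a single byte.
        out.push_back(static_cast<std::uint8_t>(length));
    } else if (length < 0xFFFE) {
        // Medium string: 0xFF marker, then the length as a 2-byte word.
        out.push_back(0xFF);
        out.push_back(static_cast<std::uint8_t>(length & 0xFF));
        out.push_back(static_cast<std::uint8_t>((length >> 8) & 0xFF));
    } else {
        // Long string: 0xFF 0xFF 0xFF marker, then the length as a 4-byte dword.
        out.push_back(0xFF);
        out.push_back(0xFF);
        out.push_back(0xFF);
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<std::uint8_t>((length >> shift) & 0xFF));
        }
    }
    return out;
}
```

Under these assumptions, a 41-character ANSI string gets the single prefix byte 29, and a 300-character ANSI string gets FF 2C 01.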

So, the correct way to read a string is as follows:

Assume the string data is ANSI unless told otherwise.

1. Read a byte. If its value is < 255, the string length is that value; goto 3.
2. Read a word. If its value is 0xFFFE, the string data is Unicode; goto 1. Otherwise, if its value is < 65535, the string length is that value; goto 3. Otherwise, read a dword; the string length is its value; goto 3.
3. Read string-length 8-bit or 16-bit values, depending on whether the string is ANSI or Unicode, and then convert to the desired encoding as needed.
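The steps above can be sketched in portable C++ as follows. This is an illustration rather than MFC's own code: `ByteReader`, `RawArchiveString`, and `readArchiveString` are hypothetical names, the input is assumed to be already loaded into memory, and multi-byte values are assumed to be little-endian. The same control flow should carry over to a C# `BinaryReader`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical result type: the raw character data plus what the prefix said.
struct RawArchiveString {
    bool isUnicode = false;           // true if the 0xFF 0xFFFE marker was seen
    std::uint32_t length = 0;         // character count, not byte count
    std::vector<std::uint8_t> bytes;  // length (ANSI) or length * 2 (Unicode) bytes
};

// Hypothetical little-endian cursor over an in-memory copy of the file.
class ByteReader {
public:
    explicit ByteReader(std::vector<std::uint8_t> data) : data_(std::move(data)) {}

    std::uint8_t readByte() { need(1); return data_[pos_++]; }

    std::uint16_t readWord() {
        std::uint16_t lo = readByte();
        std::uint16_t hi = readByte();
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }

    std::uint32_t readDword() {
        std::uint32_t lo = readWord();
        std::uint32_t hi = readWord();
        return lo | (hi << 16);
    }

    std::vector<std::uint8_t> readBytes(std::size_t n) {
        need(n);
        std::vector<std::uint8_t> out(data_.begin() + pos_, data_.begin() + pos_ + n);
        pos_ += n;
        return out;
    }

private:
    void need(std::size_t n) const {
        if (data_.size() - pos_ < n) throw std::out_of_range("unexpected end of archive data");
    }
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

RawArchiveString readArchiveString(ByteReader& in) {
    RawArchiveString s;  // ANSI unless the Unicode marker shows up
    for (;;) {
        std::uint8_t b = in.readByte();                      // step 1
        if (b < 0xFF) { s.length = b; break; }
        std::uint16_t w = in.readWord();                     // step 2
        if (w == 0xFFFE) { s.isUnicode = true; continue; }   // back to step 1
        if (w < 0xFFFF) { s.length = w; break; }
        s.length = in.readDword();
        break;
    }
    // Step 3: read the character data; decoding is left to the caller.
    std::size_t byteCount = static_cast<std::size_t>(s.length) * (s.isUnicode ? 2 : 1);
    s.bytes = in.readBytes(byteCount);
    return s;
}
```

Fed the bytes 04 31 2e 31 30 from the question, this yields an ANSI string of length 4 whose data is "1.10". Converting `bytes` to the target encoding (the ANSI code page or UTF-16LE) is deliberately left out of the sketch.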

score 3 · Accepted Answer

According to the documentation:

The main CArchive implementation can be found in ARCCORE.CPP.

If you don't have the MFC source, see this.
