c++ - Reading a binary file into a struct (C++)

Question

I'm having trouble reading a binary file into my struct correctly. The struct looks like this:

struct Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

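It is 37 bytes (25 bytes from the char array, plus 4 bytes per integer). My .dat file is 185 bytes. It holds 5 students with 3 integer grades each, so each student takes up 37 bytes (37*5=185).

In plain-text form it looks like this: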

Bart Simpson          75   65   70
Ralph Wiggum          35   60   44
Lisa Simpson          100  98   91
Martin Prince         99   98   99
Milhouse Van Houten   80   87   79

I can read each record member by member with the following code:

Student stud;

fstream file;
file.open("quizzes.dat", ios::in | ios::out | ios::binary);

if (file.fail())
{
    cout << "ERROR: Cannot open the file..." << endl;
    exit(0);
}

file.read(stud.name, sizeof(stud.name));
file.read(reinterpret_cast<char *>(&stud.quiz1), sizeof(stud.quiz1));
file.read(reinterpret_cast<char *>(&stud.quiz2), sizeof(stud.quiz2));
file.read(reinterpret_cast<char *>(&stud.quiz3), sizeof(stud.quiz3));

while(!file.eof())
{
    cout << left 
         << setw(25) << stud.name
         << setw(5)  << stud.quiz1
         << setw(5)  << stud.quiz2
         << setw(5)  << stud.quiz3
         << endl;

    // Reading the next record
    file.read(stud.name, sizeof(stud.name));
    file.read(reinterpret_cast<char *>(&stud.quiz1), sizeof(stud.quiz1));
    file.read(reinterpret_cast<char *>(&stud.quiz2), sizeof(stud.quiz2));
    file.read(reinterpret_cast<char *>(&stud.quiz3), sizeof(stud.quiz3));
}

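That gives me nice output, but I want to be able to read a whole struct at a time instead of reading each member of each struct individually. The code below is what I believe should do the job, but... it doesn't work (the output follows it):

*Opening the file, the struct declaration, and similar parts are omitted.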

file.read(reinterpret_cast<char *>(&stud), sizeof(stud));

while(!file.eof())
{
    cout << left 
         << setw(25) << stud.name
         << setw(5)  << stud.quiz1
         << setw(5)  << stud.quiz2
         << setw(5)  << stud.quiz3
         << endl;

    file.read(reinterpret_cast<char *>(&stud), sizeof(stud));
}

Output:

Bart Simpson             16640179201818317312
ph Wiggum                288358417665884161394631027
impson                   129184563217692391371917853806
ince                     175193530917020655191851872800

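The only part it doesn't mess up is the first name; after that it all goes downhill. I've tried everything and I have no idea what's wrong. I even dug through the books I own and couldn't find anything. The examples there look like what I have, and they work, but for some strange reason mine doesn't. I did a file.get(ch) (ch being a char) at byte 25 and it returned K, which is ASCII for 75... that's the first test score, so everything is where it should be. It just isn't being read into my struct correctly.

Any help would be greatly appreciated; I'm simply stuck on this.

Edit: After receiving so much unexpected and awesome input from you all, I decided to take your advice and stick with reading one member at a time. I made things cleaner and smaller by using functions. Thanks again for such quick and enlightening input. It is much appreciated.

If you're interested in the workaround that most people don't recommend, scroll down to the third answer, by user1654209. That workaround works flawlessly, but read all the comments to see why it isn't favored.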

score 10 · Accepted Answer

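Your struct has almost certainly been padded to preserve the alignment of its contents. That means it won't be 37 bytes, and that mismatch throws the reads out of sync. Judging by the way each string loses 3 characters, it seems to have been padded to 40 bytes.

Since the padding probably sits between the string and the integers, not even the first record is read correctly.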
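To see where the padding sits, one option is to ask the compiler directly. The following is a minimal sketch, not part of the original answer; the constant names (kPackedSize, kGapAfterName, kTotalPadding) are made up for illustration.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

// Sum of the member sizes: what a field-by-field writer puts in the file.
constexpr std::size_t kPackedSize = sizeof(Student::name) + 3 * sizeof(int);

// Padding bytes the compiler inserted between name and quiz1.
constexpr std::size_t kGapAfterName =
    offsetof(Student, quiz1) - sizeof(Student::name);

// Total padding in the struct (0 only if the compiler added none).
constexpr std::size_t kTotalPadding = sizeof(Student) - kPackedSize;
```

On a typical platform where int is 4 bytes and 4-byte aligned, kPackedSize is 37, kGapAfterName is 3, and sizeof(Student) is 40, which matches the three characters missing from each name in the question's output.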

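In this case I'd recommend not trying to read your data as a binary blob, and sticking to reading individual fields. It's far more robust, especially if you ever want to change your structure.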

score 4 · Accepted Answer

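Without seeing the code that wrote the data, I'm guessing you wrote it the same way you read it in the first example, one element after another. In that case each record in the file really will be 37 bytes.

However, because the compiler pads structs to place members on good boundaries for optimization reasons, your struct is 40 bytes. So when you read the complete struct in a single call, you actually read 40 bytes at a time, which means your reads fall out of step with the actual records in the file.

You either have to re-implement the writing so that it writes the complete struct in one go, or use the first reading method and read one member field at a time.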

score 4 · Accepted Answer

A simple workaround is to pack the struct to 1-byte alignment.

Using gcc:

struct __attribute__((packed)) Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

Using msvc:

#pragma pack(push, 1) //set padding to 1 byte, saves previous value
struct  Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};
#pragma pack(pop) //restore previous pack value

Edit: As user ahans pointed out, gcc has supported pragma pack since version 2.7.2.3 (released in 1997), so if you target msvc and gcc it seems safe to use pragma pack as the only packing notation.

score 3 · Accepted Answer

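As you have already found out, padding is the problem here. And, as others have suggested, the right way to solve it is to read each member individually, as you did in your example. I don't expect that to cost much in performance compared with reading the whole thing in one go. However, if you still want to read it in one go, you can tell the compiler to do the padding differently: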

#pragma pack(push, 1)
struct Student
{
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};
#pragma pack(pop)

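With #pragma pack(push, 1) you tell the compiler to save the current pack value on an internal stack and to use a pack value of 1 from then on. That gives you 1-byte alignment, which in this case means no padding at all. With #pragma pack(pop) you tell the compiler to take the last value off the stack and use it afterwards, restoring the behavior the compiler used before the definition of the struct.

Although #pragma usually indicates non-portable, compiler-dependent features, this one works at least with GCC and Microsoft VC++.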

score 1 · Accepted Answer

There is more than one way to solve the problem in this thread. Here is a solution based on a union of the struct and a char buf:

#include <cstdlib>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

/*
This is the main idea of the technique: Put the struct
inside a union. And then put a char array that is the
number of chars needed for the struct.

union causes sStudent and buf to be at the exact same
place in memory. They overlap each other!
*/
union uStudent
{
    struct sStudent
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    } field;

    char buf[ sizeof(sStudent) ];    // sizeof calcs the number of chars needed
};

void create_data_file(std::fstream& file, uStudent* oStudent, int idx)
{
    if (idx < 0)
    {
        // index passed beginning of oStudent array. Return to start processing.
        return;
    }

    // have not yet reached idx = -1. Recurse first (head recursion) so that
    // the earlier records are written before this one
    create_data_file(file, oStudent, idx - 1);

    // write a record
    file.write(oStudent[idx].buf, sizeof(uStudent));

    // return to write another record or to finish
    return;
}


std::string read_in_data_file(std::fstream& file, std::stringstream& strm_buf)
{
    // allocate a buffer of the correct size
    uStudent temp_student;

    // read in to buffer
    file.read( temp_student.buf, sizeof(uStudent) );

    // at end of file?
    if (file.eof())
    {
        // finished
        return strm_buf.str();
    }

    // not at end of file. Stuff buf for display
    strm_buf << std::setw(25) << std::left << temp_student.field.name;
    strm_buf << std::setw(5) << std::right << temp_student.field.quiz1;
    strm_buf << std::setw(5) << std::right << temp_student.field.quiz2;
    strm_buf << std::setw(5) << std::right << temp_student.field.quiz3;
    strm_buf << std::endl;

    // tail recurse and see whether at end of file
    return read_in_data_file(file, strm_buf);
}



std::string quiz(void)
{

    /*
    declare and initialize array of uStudent to facilitate
    writing out the data file and then demonstrating
    reading it back in. The inner braces initialize the
    first union member, field.
    */
    uStudent oStudent[] =
    {
        {{"Bart Simpson",          75,   65,   70}},
        {{"Ralph Wiggum",          35,   60,   44}},
        {{"Lisa Simpson",         100,   98,   91}},
        {{"Martin Prince",         99,   98,   99}},
        {{"Milhouse Van Houten",   80,   87,   79}}

    };


    std::fstream file;

    // ios::trunc causes the file to be created if it does not already exist.
    // ios::trunc also causes the file to be empty if it does already exist.
    file.open("quizzes.dat", std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc);

    if ( ! file.is_open() )
    {
        // the original called ShowMessage(), which is specific to C++Builder
        std::cerr << "File did not open" << std::endl;
        std::exit(1);
    }


    // create the data file
    int num_elements = sizeof(oStudent) / sizeof(uStudent);
    create_data_file(file, oStudent, num_elements - 1);

    // Don't forget
    file.flush();

    /*
    We wrote actual integers. So, you cannot check the file so
    easily by just using a common text editor such as Windows Notepad.

    You would need an editor that shows hex values or something similar.
    An integrated development environment (IDE) is likely to have such
    an editor.   Of course, not always so.
    */


    /*
    Now, read the file back in for display. Reading into a string buffer
    for display all at once. Can modify code to display the string buffer
    wherever you want.
    */

    // make sure at beginning of file
    file.seekg(0, std::ios::beg);

    std::stringstream strm_buf;
    strm_buf.str( read_in_data_file(file, strm_buf) );

    file.close();

    return strm_buf.str();
}

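Call quiz() and receive a string formatted for display to std::cout, for writing to a file, or for whatever else.

The main idea is that all items in a union start at the same address in memory. So you can have a char or wchar_t buf that is the same size as the struct you want to write to or read from the file. Note that zero casts are needed. There is not a single cast in the code.

I didn't have to worry about padding, either.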
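One point worth checking is how large the union actually is. The sketch below is an added illustration rather than part of the answer, and the constant names are hypothetical. It shows that the union has the same size as the padded struct, so the approach stays in step because the same layout is used for both writing and reading, not because the padding is gone.

```cpp
#include <cstddef>   // std::size_t

union uStudent
{
    struct sStudent
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    } field;

    char buf[ sizeof(sStudent) ];
};

// The union is exactly as large as the (possibly padded) struct inside it.
static_assert(sizeof(uStudent) == sizeof(uStudent::sStudent),
              "buf overlays the whole struct, padding included");

// Record size of a file written member by member, as in the question.
constexpr std::size_t kFileRecordSize = 25 + 3 * sizeof(int);

// True only if the compiler added no padding to the struct.
constexpr bool kMatchesQuestionFile = (sizeof(uStudent) == kFileRecordSize);
```

On a typical platform with 4-byte, 4-byte-aligned int, sizeof(uStudent) is 40 and kMatchesQuestionFile is false, so this union would not read the 185-byte file from the question without also packing the struct.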

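For those who dislike recursion, sorry. For me, solving it with recursion is easier and less error-prone. Maybe that isn't so for others? The recursions can be converted to loops, and for very large files they would need to be.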
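A loop version of the reading step could look like the following sketch. It reuses the answer's union and output format; read_all_records is a hypothetical name, and taking a std::istream instead of a std::fstream is a choice made here so that the function can also be fed from a string stream.

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

union uStudent
{
    struct sStudent
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    } field;

    char buf[ sizeof(sStudent) ];
};

// Iterative replacement for the recursive reader: one record per pass,
// so stack depth no longer grows with the number of records in the file.
std::string read_all_records(std::istream& in)
{
    std::ostringstream out;
    uStudent rec;

    // read() returns the stream; the loop ends on EOF or a short read
    while (in.read(rec.buf, sizeof(rec)))
    {
        out << std::setw(25) << std::left  << rec.field.name
            << std::setw(5)  << std::right << rec.field.quiz1
            << std::setw(5)  << std::right << rec.field.quiz2
            << std::setw(5)  << std::right << rec.field.quiz3
            << '\n';
    }
    return out.str();
}
```

The write side can be converted the same way, with a plain for loop over the array indices.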

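For those who like recursion, this is yet another example of using it.

I'm not claiming that using a union is the best solution. It seems to be a solution. Maybe you'll like it?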
