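I'm really new to C programming. As an exercise, I'm trying to read a file and save its contents into a dynamic array of structs. The txt file contains: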

Movie id:1448
title:The movie
surname of director: lorez
name of director: john
date: 3
month: september
year: 1997

The structs should look like this:

typedef struct date
{
  int day, month, year;
} date;

typedef struct director_info
{
  char* director_surname, director_name;
} director_info;

typedef struct movie
{
  int id;
  char* title;
  director_info* director;
  date* release_date;
} movie;

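What I know is that I should read it with fgets, which I think is the way to go, but I can't figure out how to create the structs and save the data into them.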

    FILE *readingText;
    readingText = fopen("movies.txt", "r");

    if (readingText == NULL)
    {
        printf("Can't open file\n");
        return 1;
    }

    while (fgets(line, sizeof(line), readingText) != NULL)
    {
        .....
    }
    fclose(readingText);
1 Answer

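Reading multi-line input can be a bit challenging, and combining it with allocating nested structures gives you a good learning experience in both file I/O and dynamic memory allocation. But before looking at your task, there are a few misconceptions to clear up: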

char* director_surname, director_name;

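does NOT declare two pointers-to-char. It declares a single pointer (director_surname) and then a single character (director_name). The lesson: the unary '*' indicating the level of pointer indirection goes with the variable, not the type. Why? Just as you experienced: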

char* a, b, c;

does NOT declare three pointers-to-char; it declares one pointer and two char variables. Writing:

char *a, b, c;

makes that clear.
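
As a minimal sketch of that rule (the function name size_of_plain_char is made up for illustration), the corrected struct puts a '*' on each declarator, and the helper shows that a variable declared without its own '*' is just a char:

```c
#include <stddef.h>

/* '*' belongs to each declarator, so both members are pointers to char */
typedef struct director_info {
    char *director_surname, *director_name;
} director_info;

/* same rule at block scope: only 'p' is a pointer, 'c' is a plain char */
size_t size_of_plain_char (void)
{
    char *p = NULL, c = 'x';

    (void)p;            /* silence unused-variable warnings */
    return sizeof c;    /* sizeof (char) is 1 by definition */
}
```

If you prefer one declaration per line, the question never arises, since each variable then carries its full type on its own line.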

Multi-line Reads

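When you have to coordinate data from multiple lines, you must validate that you obtained the needed information for every line in the group before you consider the input for that group valid. There are a number of approaches, but perhaps one of the more straightforward is simply to use temporary variables to hold each input and keep a counter that you increment each time a successful input is received. If you fill all your temporary variables, and your counter reflects the correct number of inputs, you can then allocate memory for each of the structs and copy the temporary variables to their permanent storage. You then reset your counter to zero and repeat until you run out of lines in your file.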
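
The counter approach can be sketched on a smaller, hypothetical two-line record (an "id:" line followed by a "year:" line). The function name count_records and the record layout are assumptions for illustration only, not your file format:

```c
#include <stdio.h>

/* count complete two-line records ("id: N" then "year: N") in an array
 * of lines; any line that does not fit the expected position resets
 * the counter, exactly as a failed line would in a larger block
 */
size_t count_records (const char **lines, size_t nlines)
{
    int good = 0, tmpid, tmpyear;   /* counter and temporary values */
    size_t complete = 0;

    for (size_t i = 0; i < nlines; i++) {
        if (good == 0 && sscanf (lines[i], "id: %d", &tmpid) == 1)
            good++;
        else if (good == 1 && sscanf (lines[i], "year: %d", &tmpyear) == 1)
            good++;
        else
            good = 0;               /* out-of-sequence line, start over */

        if (good == 2) {            /* all lines of one record validated */
            complete++;             /* (here you would allocate and copy) */
            good = 0;
        }
    }
    return complete;
}
```

A record interrupted by an unexpected line is simply discarded, and the scan picks up again at the next "id:" line.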

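Most of your reads are straightforward, with the exception of month, where the month is read as a lowercase string that you must then convert to an int for storage in your struct date. Probably the easiest way to handle that is to create a lookup table (e.g., a constant array of pointers to string literals, one for each of the twelve months). Then, after reading your month string, you can loop over the array using strcmp() to map the index of the matching month to your struct. (Add +1 to make, e.g., january month 1, february month 2, etc.) For example, you could use something like: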

const char *months[] = { "january", "february", "march", "april",
                        "may", "june", "july", "august", "september",
                        "october", "november", "december" };
#define NMONTHS (int)(sizeof months / sizeof *months)

where the macro NMONTHS is 12, the number of elements in months.
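
One way to package that lookup, as a sketch: the helper name month_number and the names month_names and N_MONTH_NAMES are invented here, and the comparison is case-sensitive, so it assumes lowercase month names like those in your file:

```c
#include <string.h>

static const char *month_names[] = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december"
};
#define N_MONTH_NAMES (int)(sizeof month_names / sizeof *month_names)

/* map a lowercase month name to 1..12, or return -1 if it is not found */
int month_number (const char *name)
{
    for (int i = 0; i < N_MONTH_NAMES; i++)
        if (strcmp (name, month_names[i]) == 0)
            return i + 1;           /* +1 so that january == 1 */

    return -1;                      /* no match in the table */
}
```

For example, month_number ("september") returns 9, and an unknown string returns -1, which the caller can treat as a failed line.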

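Then, to read your file, your basic approach is to read each line with fgets() and then parse the needed information from the line with sscanf(), validating every input, conversion and allocation along the way. Validation is key to the success of any piece of code, and it is especially critical for multi-line reads and conversions.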

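For example, given your structs, you could declare the additional constants needed and declare and initialize the temporary variables, then open the file given as the first argument and validate that it is open for reading: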

...
#define MAXC 1024       /* if you need a constant, #define one (or more) */
#define MAXN  128
#define AVAIL   2
...
int main (int argc, char **argv) {
    
    char line[MAXC], tmptitle[MAXN], tmpsurnm[MAXN], tmpnm[MAXN], tmpmo[MAXN];
    int good = 0, tmpid;
    date tmpdt = { .day = 0 };      /* temporary date struct to fill */
    movie *movies = NULL;
    size_t avail = AVAIL, used = 0;
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    
    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }

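Above, your variable good is your counter, which you increment for each good read and conversion of data from each of the seven lines that make up an input block. When good == 7 you have confirmed that you have all the data associated with one movie, and you can allocate and fill the final storage with all the temporary values.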

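The used and avail counters track how many allocated struct movie you have available and how many of those are used. When used == avail, you know it is time to realloc() your block of movies to add more. That is how dynamic allocation schemes work. You allocate some anticipated number of objects you need. You loop reading and filling objects until you have filled what you have allocated, then you reallocate more and keep going.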

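You can add as much additional memory each time as you like, but the general scheme is to double the allocated memory each time a reallocation is required. That provides a good balance between the number of allocations required and the growth in the number of objects available.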

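(memory operations are relatively expensive, so you want to avoid allocating for each new outer struct. Although realloc() has gotten better about extending a block rather than creating a new one and copying every time, a scheme that allocates larger blocks will still be the more efficient approach in the end)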
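
The doubling scheme, reduced to a small sketch on a plain array of int (the intvec type and the intvec_push function are hypothetical names for illustration; the same pattern applies to your block of struct movie):

```c
#include <stdlib.h>

typedef struct intvec {
    int *data;              /* allocated block (NULL until first push) */
    size_t used, avail;     /* elements filled, elements allocated */
} intvec;

/* append value, doubling the block when full; returns 1 on success,
 * 0 on allocation failure (the existing block is left intact)
 */
int intvec_push (intvec *v, int value)
{
    if (v->used == v->avail) {
        size_t newavail = v->avail ? 2 * v->avail : 2;
        /* realloc to a temporary pointer so v->data is not lost on failure;
         * realloc (NULL, n) behaves like malloc (n) for the first call
         */
        void *tmp = realloc (v->data, newavail * sizeof *v->data);

        if (!tmp)
            return 0;
        v->data = tmp;
        v->avail = newavail;
    }
    v->data[v->used++] = value;
    return 1;
}
```

Starting from an intvec initialized to all zeros, five pushes grow the block from 2 to 4 to 8 elements, so only three allocations are needed rather than five.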

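With the temporary variables and counter declared, you can start your multi-line read. Let's take the first line, the id line, as an example: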

    while (fgets (line, MAXC, fp)) {    /* read each line */
        /* read ID line & validate conversion */
        if (good == 0 && sscanf (line, "Movie id: %d", &tmpid) == 1)
            good++;     /* increment good line counter */

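You read the line and check whether good == 0 to coordinate the read with the id line. You attempt the conversion to int and validate both. If you successfully store an integer in your temporary id, you increment your good counter.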

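Your read of the Title line will be similar, except this time it will be an else if instead of a plain if. The id line above and the read of the next line, title, would be: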

    while (fgets (line, MAXC, fp)) {    /* read each line */
        /* read ID line & validate conversion */
        if (good == 0 && sscanf (line, "Movie id: %d", &tmpid) == 1)
            good++;     /* increment good line counter */
        /* read Title line & validate conversion */
        else if (good == 1 && sscanf (line, "title:%127[^\n]", tmptitle) == 1)
            good++;     /* increment good line counter */

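Note: any time you read a character string into any array with any of the scanf() family of functions, you must use the field-width modifier (127 above) to limit the read to what your array can hold (+1 for '\0') to protect your array bounds from being overwritten. If you fail to include the field-width modifier, then using the scanf() function is no safer than using gets(). See: Why gets() is so dangerous it should never be used!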
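
A small sketch of the field-width rule in isolation. The helper name parse_title and the TITLE_MAX constant are assumptions for illustration; the width in the format string (127) must always be one less than the array size (128):

```c
#include <stdio.h>

#define TITLE_MAX 128   /* array size; the sscanf width below is 127 */

/* copy the text after "title:" into title, storing at most 127 chars
 * plus the terminating '\0'; returns 1 on success, 0 if line is not
 * a title line
 */
int parse_title (const char *line, char title[TITLE_MAX])
{
    return sscanf (line, "title:%127[^\n]", title) == 1;
}
```

An over-long title is truncated to 127 characters instead of writing past the end of the array, and a line that does not start with "title:" is rejected.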

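As each line is read and successfully converted and stored, good is incremented to set up the read of the next line's values into the proper temporary variable.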

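Note I said you have more work to do for the month read and conversion, since you read, e.g., "september" but need to store the integer 9 in your struct. Using the lookup table from the beginning, you read and obtain the string with the month name, then loop to find the index in the lookup table (you will want to add +1 to the index so that january == 1, and so on). You could do it like: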

        /* read Month line and loop comparing with array to map index */
        else if (good == 5 && sscanf (line, "month: %127s", tmpmo) == 1) {
            tmpdt.month = -1;   /* set month -1 as flag to test if tmpmo found */
            for (int i = 0; i < NMONTHS; i++) {
                if (strcmp (tmpmo, months[i]) == 0) {
                    tmpdt.month = i + 1;    /* add 1 to make january == 1, etc... */
                    break;
                }
            }
            if (tmpdt.month > -1)   /* if not found, flag still -1 - failed */
                good++;
            else
                good = 0;
        }

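After your last else if, for year, you include an else so that any failure of any one line in the block resets good = 0; and the loop then attempts to read and match the next id line in the file, e.g.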

        /* read Year line & validate */
        else if (good == 6 && sscanf (line, "year: %d", &tmpdt.year) == 1)
            good++;
        else
            good = 0;

Dynamic Allocation

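Dynamic allocation for your nested structs isn't hard, but you must keep clear in your mind how you will approach it. Your outer struct, struct movie, is the one you will allocate and reallocate using used == avail, etc. You will have to allocate for struct date and struct director_info each time all seven of your temporary variables are filled, validated and ready to be put in final storage. You start your allocation block by checking whether your struct movie block has been allocated, and if not, allocating it. If it has, and used == avail, you reallocate it.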

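Now, every time you realloc() you use a temporary pointer, so when (not if) realloc() fails and returns NULL, you do not lose your pointer to the currently allocated storage by overwriting it with the NULL returned, which would create a memory leak. The initial handling of whether to allocate or reallocate for your struct movie looks like: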

        /* if good 7, all sequential lines and values for movie read */
        if (good == 7) {
            director_info *newinfo;     /* declare new member pointers */
            date *newdate;
            size_t len;
            
            /* if 1st allocation for movies, allocate AVAIL no. of movie struct */
            if (movies == NULL) {
                movies = malloc (avail * sizeof *movies);
                if (!movies) {                  /* validate EVERY allocation */
                    perror ("malloc-movies");
                    exit (EXIT_FAILURE);
                }
            }
            /* if movies needs reallocation */
            if (used == avail) {
                /* ALWAYS realloc with a temporary pointer */
                void *tmp = realloc (movies, 2 * avail * sizeof *movies);
                if (!tmp) {
                    perror ("realloc-movies");
                    break;
                }
                movies = tmp;
                avail *= 2;
            }

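Now that you have a valid block of struct movie, you can store the id directly, allocate for the title, and assign the allocated block holding the title to the title pointer in each struct movie. We allocate two struct movie to start with. When you begin, used == 0 and avail = 2 (see the AVAIL constant at the top for where the 2 comes from). Handling the id and allocating for title works as follows: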

            movies[used].id = tmpid;    /* set movie ID to tmpid */
            
            /* get length of tmptitle, allocate, copy to movie title */
            len = strlen (tmptitle);
            if (!(movies[used].title = malloc (len + 1))) {
                perror ("malloc-movies[used].title");
                break;
            }
            memcpy (movies[used].title, tmptitle, len + 1);

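(note: when you have a block of memory holding multiple structs and use [..] to index each struct, the [..] acts as a dereference of the pointer, so you use the '.' operator to access the member following the [..], rather than the '->' operator you would normally use to dereference a struct pointer to access a member; the dereference is already done by [..])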

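Additionally, since you know len, there is no reason to use strcpy() to copy tmptitle to movies[used].title and have strcpy() scan the string for the nul-terminating character at the end. You already know the number of characters, so just use memcpy() to copy len + 1 bytes. (Note: if you have strdup(), you can allocate and copy in a single call, but strdup() is not part of the C library in C11.)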
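
If strdup() is not available, the allocate-then-memcpy() pattern can be wrapped in a small helper. This is only a sketch; the name dupstr is invented for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* allocate len + 1 bytes and copy s, including its terminating '\0';
 * returns NULL on allocation failure, and the caller must free() the result
 */
char *dupstr (const char *s)
{
    size_t len = strlen (s);        /* the one and only scan of the string */
    char *copy = malloc (len + 1);

    if (!copy)
        return NULL;

    return memcpy (copy, s, len + 1);   /* memcpy returns its destination */
}
```

With it, storing the title reduces to movies[used].title = dupstr (tmptitle); followed by the usual check for NULL.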

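Allocating for struct director_info for each element of struct movie is straightforward. You allocate the struct director_info, then use strlen() to get the length of each name, allocate storage for each, and memcpy() just as we did above.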

            /* allocate director_info struct & validate */
            if (!(newinfo = malloc (sizeof *newinfo))) {
                perror ("malloc-newinfo");
                break;
            }
            
            len = strlen (tmpsurnm);    /* get length of surname, allocate, copy */
            if (!(newinfo->director_surname = malloc (len + 1))) {
                perror ("malloc-newinfo->director_surname");
                break;
            }
            memcpy (newinfo->director_surname, tmpsurnm, len + 1);
            
            len = strlen (tmpnm);       /* get length of name, allocate, copy */
            if (!(newinfo->director_name = malloc (len + 1))) {
                perror ("malloc-newinfo->director_name");
                break;
            }
            memcpy (newinfo->director_name, tmpnm, len + 1);
            
            movies[used].director = newinfo;    /* assign allocated struct as member */

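Handling the allocation and filling of the new struct date is even easier. You just allocate, assign the 3 integer values, and then assign the address of the allocated struct date to the pointer in your struct movie, e.g.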

            /* allocate new date struct & validate */
            if (!(newdate = malloc (sizeof *newdate))) {
                perror ("malloc-newdate");
                break;
            }
            
            newdate->day = tmpdt.day;       /* populate date struct from tmpdt struct */
            newdate->month = tmpdt.month;
            newdate->year = tmpdt.year;
                    
            movies[used++].release_date = newdate;  /* assign newdate as member */
            good = 0;
        }

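That's it. You increment used++ when you assign the last pointer in your struct movie, so you are set up to fill the next element in that block with the next seven lines from the file. You reset good = 0; to ready the read loop to read the next id line from the file.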
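
Note that each break on a failed allocation leaves any pieces already allocated for the current movie unreleased, since that element is never counted in used. One possible way to handle that, sketched under the assumption that you are willing to move the per-movie allocation into helper functions (movie_fill, movie_clear and dupstr are hypothetical names, not part of the code in this answer):

```c
#include <stdlib.h>
#include <string.h>

typedef struct date { int day, month, year; } date;

typedef struct director_info {
    char *director_surname, *director_name;
} director_info;

typedef struct movie {
    int id;
    char *title;
    director_info *director;
    date *release_date;
} movie;

/* allocate a copy of s, NULL on failure */
static char *dupstr (const char *s)
{
    size_t len = strlen (s);
    char *copy = malloc (len + 1);

    return copy ? memcpy (copy, s, len + 1) : NULL;
}

/* free everything one movie element owns; safe on a partially filled one */
void movie_clear (movie *m)
{
    if (m->director) {
        free (m->director->director_surname);
        free (m->director->director_name);
    }
    free (m->director);             /* free (NULL) is a no-op */
    free (m->release_date);
    free (m->title);
    m->title = NULL;
    m->director = NULL;
    m->release_date = NULL;
}

/* fill *m from the temporary values; returns 1 on success, or 0 on any
 * allocation failure after releasing whatever was already allocated
 */
int movie_fill (movie *m, int id, const char *title,
                const char *surname, const char *name, date d)
{
    m->id = id;
    m->title = dupstr (title);
    m->release_date = malloc (sizeof *m->release_date);
    m->director = malloc (sizeof *m->director);

    if (m->director) {
        m->director->director_surname = dupstr (surname);
        m->director->director_name = dupstr (name);
    }
    if (!m->title || !m->release_date || !m->director ||
        !m->director->director_surname || !m->director->director_name) {
        movie_clear (m);
        return 0;
    }
    *m->release_date = d;           /* struct assignment copies all 3 ints */
    return 1;
}
```

In the read loop, the body of the good == 7 block after the reallocation check could then become a single call such as movie_fill (&movies[used], tmpid, tmptitle, tmpsurnm, tmpnm, tmpdt), with used++ only on success.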

Putting It All Together

If you fill in the complete code, you would end up with something similar to:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXC 1024       /* if you need a constant, #define one (or more) */
#define MAXN  128
#define AVAIL   2

const char *months[] = { "january", "february", "march", "april",
                        "may", "june", "july", "august", "september",
                        "october", "november", "december" };
#define NMONTHS (int)(sizeof months / sizeof *months)

typedef struct date {
      int day, month, year;
} date;

typedef struct director_info {
      char *director_surname, *director_name;
} director_info;

typedef struct movie {
  int id;
  char *title;
  director_info *director;
  date *release_date;
} movie;

void prnmovies (movie *movies, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf ("\nMovie ID : %4d\n"
                "Title    : %s\n"
                "Director : %s %s\n"
                "Released : %02d/%02d/%4d\n",
                movies[i].id, movies[i].title, 
                movies[i].director->director_name, movies[i].director->director_surname,
                movies[i].release_date->day, movies[i].release_date->month,
                movies[i].release_date->year);
}

void freemovies (movie *movies, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        free (movies[i].title);
        free (movies[i].director->director_surname);
        free (movies[i].director->director_name);
        free (movies[i].director);
        free (movies[i].release_date);
    }
    free (movies);
}

int main (int argc, char **argv) {
    
    char line[MAXC], tmptitle[MAXN], tmpsurnm[MAXN], tmpnm[MAXN], tmpmo[MAXN];
    int good = 0, tmpid;
    date tmpdt = { .day = 0 };      /* temporary date struct to fill */
    movie *movies = NULL;
    size_t avail = AVAIL, used = 0;
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    
    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    while (fgets (line, MAXC, fp)) {    /* read each line */
        /* read ID line & validate conversion */
        if (good == 0 && sscanf (line, "Movie id: %d", &tmpid) == 1)
            good++;     /* increment good line counter */
        /* read Title line & validate conversion */
        else if (good == 1 && sscanf (line, "title:%127[^\n]", tmptitle) == 1)
            good++;     /* increment good line counter */
        /* read director Surname line & validate */
        else if (good == 2 && sscanf (line, "surname of director: %127[^\n]", 
                                        tmpsurnm) == 1)
            good++;
        /* read director Name line & validate */
        else if (good == 3 && sscanf (line, "name of director: %127[^\n]", tmpnm) == 1)
            good++;
        /* read Day line & validate */
        else if (good == 4 && sscanf (line, "date: %d", &tmpdt.day) == 1)
            good++;
        /* read Month line and loop comparing with array to map index */
        else if (good == 5 && sscanf (line, "month: %127s", tmpmo) == 1) {
            tmpdt.month = -1;   /* set month -1 as flag to test if tmpmo found */
            for (int i = 0; i < NMONTHS; i++) {
                if (strcmp (tmpmo, months[i]) == 0) {
                    tmpdt.month = i + 1;    /* add 1 to make january == 1, etc... */
                    break;
                }
            }
            if (tmpdt.month > -1)   /* if not found, flag still -1 - failed */
                good++;
            else
                good = 0;
        }
        /* read Year line & validate */
        else if (good == 6 && sscanf (line, "year: %d", &tmpdt.year) == 1)
            good++;
        else
            good = 0;
        
        /* if good 7, all sequential lines and values for movie read */
        if (good == 7) {
            director_info *newinfo;     /* declare new member pointers */
            date *newdate;
            size_t len;
            
            /* if 1st allocation for movies, allocate AVAIL no. of movie struct */
            if (movies == NULL) {
                movies = malloc (avail * sizeof *movies);
                if (!movies) {                  /* validate EVERY allocation */
                    perror ("malloc-movies");
                    exit (EXIT_FAILURE);
                }
            }
            /* if movies needs reallocation */
            if (used == avail) {
                /* ALWAYS realloc with a temporary pointer */
                void *tmp = realloc (movies, 2 * avail * sizeof *movies);
                if (!tmp) {
                    perror ("realloc-movies");
                    break;
                }
                movies = tmp;
                avail *= 2;
            }
            
            movies[used].id = tmpid;    /* set movie ID to tmpid */
            
            /* get length of tmptitle, allocate, copy to movie title */
            len = strlen (tmptitle);
            if (!(movies[used].title = malloc (len + 1))) {
                perror ("malloc-movies[used].title");
                break;
            }
            memcpy (movies[used].title, tmptitle, len + 1);
            
            
            /* allocate director_info struct & validate */
            if (!(newinfo = malloc (sizeof *newinfo))) {
                perror ("malloc-newinfo");
                break;
            }
            
            len = strlen (tmpsurnm);    /* get length of surname, allocate, copy */
            if (!(newinfo->director_surname = malloc (len + 1))) {
                perror ("malloc-newinfo->director_surname");
                break;
            }
            memcpy (newinfo->director_surname, tmpsurnm, len + 1);
            
            len = strlen (tmpnm);       /* get length of name, allocate, copy */
            if (!(newinfo->director_name = malloc (len + 1))) {
                perror ("malloc-newinfo->director_name");
                break;
            }
            memcpy (newinfo->director_name, tmpnm, len + 1);
            
            movies[used].director = newinfo;    /* assign allocated struct as member */
            
            /* allocate new date struct & validate */
            if (!(newdate = malloc (sizeof *newdate))) {
                perror ("malloc-newdate");
                break;
            }
            
            newdate->day = tmpdt.day;       /* populate date struct from tmpdt struct */
            newdate->month = tmpdt.month;
            newdate->year = tmpdt.year;
                    
            movies[used++].release_date = newdate;  /* assign newdate as member */
            good = 0;
        }
        
    }
    if (fp != stdin)   /* close file if not stdin */
        fclose (fp);

    prnmovies (movies, used);       /* print stored movies */
    freemovies (movies, used);      /* free all allocated memory */
}

(note: prnmovies() was added to output all stored movies, and freemovies() to free all allocated memory)

Example Input File

Rather than just a single block of seven lines for one movie, let's add another to make sure the code will loop through a file, e.g.

$ cat dat/moviegroups.txt
Movie id:1448
title:The movie
surname of director: lorez
name of director: john
date: 3
month: september
year: 1997
Movie id:1451
title:Election - Is the Cheeto Tossed?
surname of director: loreza
name of director: jill
date: 3
month: november
year: 2020

Example Use/Output

Processing the input file containing two movies' worth of data, dat/moviegroups.txt, you would get:

$ ./bin/movieinfo dat/moviegroups.txt

Movie ID : 1448
Title    : The movie
Director : john lorez
Released : 03/09/1997

Movie ID : 1451
Title    : Election - Is the Cheeto Tossed?
Director : jill loreza
Released : 03/11/2020

Memory Use/Error Check

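In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed when it is no longer needed.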

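It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.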

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.

$ valgrind ./bin/movieinfo dat/moviegroups.txt
==9568== Memcheck, a memory error detector
==9568== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9568== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==9568== Command: ./bin/movieinfo dat/moviegroups.txt
==9568==

Movie ID : 1448
Title    : The movie
Director : john lorez
Released : 03/09/1997

Movie ID : 1451
Title    : Election - Is the Cheeto Tossed?
Director : jill loreza
Released : 03/11/2020
==9568==
==9568== HEAP SUMMARY:
==9568==     in use at exit: 0 bytes in 0 blocks
==9568==   total heap usage: 14 allocs, 14 frees, 5,858 bytes allocated
==9568==
==9568== All heap blocks were freed -- no leaks are possible
==9568==
==9568== For counts of detected and suppressed errors, rerun with: -v
==9568== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

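There is a lot of information in this answer (it always turns out longer than I anticipated), but a fair explanation of what is happening takes a little while. Go slowly, understand what each bit of code is doing and how the allocations are handled (it will take time to digest). If you get stuck, drop a comment and I'm happy to explain further.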

answered 2020-11-03T15:34:40.670