How to read alternate lines from a file and store them in a struct

Posted 2019-08-19 07:35

Question:

I have a file that needs to be read by my program. The file is shown below. The very first line of the file contains a single integer indicating the number of journal entries in the file. I need to write a C program that reads the file and stores the contents in a dynamically allocated array of structs.

4
12/04/2010 
Interview went well I think, though was told to wear shoes. 
18/04/2010 
Doc advised me to concentrate on something... I forget. 
03/05/2010
Was asked today if I was an art exhibit. 
19/05/2010 
Apparently mudcakes not made of mud, or angry wasps.

I was able to use strtok() to split out the day, month and year and store them in my struct, but I am stuck on saving the strings to my structs. Here is my code using strtok():

FILE* file=fopen("struct.txt","r");
        if (file==NULL){
            perror("Error opening file\n.");}
            else {
                fscanf(file,"%d",&size);
                 res=(Diary*)malloc(size*sizeof(Diary));
                 fscanf(file,"%*[^\n]");
while(fgets(day,1024,file)!= NULL){
    oken=strtok(day,"/");
    h[i]=atoi(oken);          */h[i] is my day
    oken=strtok(NULL,"/");
    fre[i]=atoi(oken);        */fre[i] is the month
    oken=strtok(NULL,"/");
    re[i]=atoi(oken);          */ re[i] is the year
    okena=strtok(day,"\n");
    strcpy(rese[i],okena);    */i had declared rese as char rese[1024]
    printf("%s",okena);
    i++;
   }

The program does not work with that strcpy(); when I run it, it keeps crashing. However, if I remove the strcpy(), it prints the following:

12
Interview went well I think, though was told to wear shoes. 
18
Doc advised me to concentrate on something... I forget. 
03
Was asked today if I was an art exhibit. 
19
Apparently mudcakes not made of mud, or angry wasps.

These are not the strings I want to store in my struct either. I am stuck on how to store the strings in a struct. My struct is:

typedef struct journal{
int day;
int month;
int year;
char entry[1024];
} Diary;

Could anyone tell me what is wrong?

Answer 1:

The following proposed code:

  1. performs the desired functionality
  2. gives meaningful names to the 'magic' numbers
  3. separates the struct definition from the typedef for that struct
  4. has been updated/edited to match the latest question details

And now the proposed code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   // strtok()

#define MAX_LINE_LEN 1024

struct journal
{
    int day;
    int month;
    int year;
    char entry[ MAX_LINE_LEN ];
};
typedef struct journal Diary;


int main( void )
{

    FILE* file=fopen("struct.txt","r");
    if ( !file )
    {
        perror("fopen failed");
        exit( EXIT_FAILURE );
    }

    // implied else, fopen successful

    char line[ MAX_LINE_LEN ];
    int size;

    if( fgets( line, sizeof line, file ) )
    {
        if ( sscanf( line, "%d", &size ) != 1 || size < 1 )
        {
            fprintf( stderr, "sscanf for data count failed\n" );
            exit( EXIT_FAILURE );
        }

        // implied else, input of data count successful
    }

    else
    {
        perror( "fgets for data count failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, fgets successful


    Diary myDiary[ size ];  // uses a VLA (variable length array), a C99 feature


    int i = 0;   // int, to match the type of size in the loop condition
    char *token = NULL;

    while( i < size && fgets( line, sizeof( line ), file) )
    {
        token = strtok( line, "/" );
        if( token )
        {
            myDiary[i].day = atoi( token );

            token = strtok( NULL, "/" );
            if( token )
            {
                myDiary[i].month = atoi( token );

                token = strtok( NULL, "/" );
                if( token )
                {
                    myDiary[i].year = atoi( token );

                    // input data directly into struct instance
                    if( !fgets( myDiary[i].entry, MAX_LINE_LEN, file ) )
                    {
                        break;  // date line with no entry line after it
                    }
                }
            }
        }
        i++;
    }

    fclose( file );
}
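
One detail worth noting about the code above: fgets() keeps the trailing '\n' in the entry it reads. A minimal sketch of one common way to remove it is shown below; trim_newline is a hypothetical helper name, not something from the question or the answer.

```c
#include <string.h>

/* overwrite the first '\n' (if any) with the nul-terminator.
 * strcspn returns the length of the string when no '\n' is present,
 * so in that case the terminator is simply rewritten in place.
 * (trim_newline is a hypothetical helper name) */
char *trim_newline (char *s)
{
    s[strcspn (s, "\n")] = '\0';
    return s;
}
```

It could be called on the entry right after a successful fgets(), before the string is printed or copied anywhere.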


Answer 2:

Your problem presents the classic problem of "How do I read and allocate for X number of something when I don't know how many beforehand?" This is actually a simpler variant of the question, because you can read the X number as the first line from your data file.

(which simplifies the problem to a single allocation of X structs after reading the first line - otherwise you would need to keep track of the current number of structs allocated and realloc as required)
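
For the general case mentioned in the parenthetical, where there is no count line to rely on, a rough sketch of the track-and-realloc approach might look like this. The names diary_list, diary_push and diary_free, and the choice to double the allocation each time, are assumptions made for illustration only.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int day, month, year;
    char *entry;
} diary_t;

/* hypothetical growable array of diary_t */
typedef struct {
    diary_t *items;     /* allocated block of structs */
    size_t used,        /* number of structs filled */
           avail;       /* number of structs allocated */
} diary_list;

/* append one entry, growing the block when it is full.
 * returns 1 on success, 0 on allocation failure. */
int diary_push (diary_list *l, int day, int month, int year, const char *text)
{
    size_t len = strlen (text);
    char *copy;

    if (l->used == l->avail) {  /* block full, realloc before adding */
        size_t newsz = l->avail ? l->avail * 2 : 2; /* doubling is an assumption */
        /* realloc to a temporary pointer so the original block
         * is not lost if realloc fails */
        void *tmp = realloc (l->items, newsz * sizeof *l->items);
        if (!tmp)
            return 0;
        l->items = tmp;
        l->avail = newsz;
    }

    if (!(copy = malloc (len + 1)))     /* storage sized to fit the text */
        return 0;
    memcpy (copy, text, len + 1);

    l->items[l->used].day = day;
    l->items[l->used].month = month;
    l->items[l->used].year = year;
    l->items[l->used].entry = copy;
    l->used++;

    return 1;
}

/* free every entry, then the block itself */
void diary_free (diary_list *l)
{
    size_t i;

    for (i = 0; i < l->used; i++)
        free (l->items[i].entry);
    free (l->items);
    l->items = NULL;
    l->used = l->avail = 0;
}
```

Start from a zero-initialized list (diary_list l = { NULL, 0, 0 };), call diary_push once per date/entry pair read, and call diary_free when done.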

To begin, I would recommend against declaring char entry[1024]; within your struct, for two reasons. First, if the array of structs is ever given automatic storage, every element carries its 1024-byte entry on the stack, and a large diary could easily overflow the stack. Second, it's just wasteful. If the goal is dynamic allocation, then allocate only the storage necessary for each entry. You can declare a single buffer of 1024 chars to use as a read buffer, but then allocate only strlen (buf) + 1 chars to hold the entry (after trimming the included '\n' from the entry).

The remainder of your problem, and the basis for any reliable code, is simply to validate each read, each parse, and each allocation so you ensure you are processing valid data and have valid storage throughout your code. This applies to every piece of code you write, not just this problem.

Putting those pieces together, and providing further details inline in the comments below, you could do something like the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct journal {
    int day,
        month,
        year;
    char *entry;    /* declare a pointer, allocate only need No. of chars */
} diary_t;

#define MAXLENGTH 1024  /* max read buf for diary entry */

int main (int argc, char **argv) {

    size_t entries = 0, i, n = 0;
    char buf[MAXLENGTH] = "";
    diary_t *diary = NULL;
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* read first line, parse number of entries */
    if (!(fgets (buf, MAXLENGTH, fp)) ||        /* validate read */
        sscanf (buf, "%zu", &entries) != 1) {   /* validate conversion */
        fputs ("error: failed to read 1st line.\n", stderr);
        return 1;
    }

    /* allocate/validate entries number of diary_t */
    if (!(diary = calloc (entries, sizeof *diary))) {
        perror ("calloc-diary_pointers");
        return 1;
    }

    for (i = 0; i < entries; i++) { /* loop No. entries times */

        size_t len = 0;
        if (!fgets (buf, MAXLENGTH, fp)) {  /* read/validate date */
            fprintf (stderr, "error: failed to read date %zu.\n", i);
            return 1;
        }
        if (sscanf (buf, "%d/%d/%d",    /* parse into day, month, year */
                    &diary[i].day, &diary[i].month, &diary[i].year) != 3) {
            fprintf (stderr, "error failed to parse date %zu.\n", i);
            return 1;
        }

        if (!fgets (buf, MAXLENGTH, fp)) {  /* read entry */
            fprintf (stderr, "error: failed to read entry %zu.\n", i);
            return 1;
        }

        len = strlen (buf);                 /* get length */
        if (len && buf[len - 1] == '\n')    /* check last char is '\n' */
            buf[--len] = 0;                 /* overwrite with nul-character */
        else if (len == MAXLENGTH - 1) {    /* check entry too long */
            fprintf (stderr, "error: entry %zu exceeds MAXLENGTH.\n", i);
            return 1;
        }

        /* allocate/validate memory for entry */
        if (!(diary[i].entry = malloc ((len + 1)))) {
            perror ("malloc-diary_entry");
            fprintf (stderr, "error: memory exhausted, entry[%zu].\n", i);
            break;  /* out of memory error, don't exit, just break */
        }
        strcpy (diary[i].entry, buf);   /* copy buf to entry */

        n++;    /* increment successful entry read */
    }
    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (i = 0; i < n; i++) {   /* output diary entries */
        printf ("entry[%2zu]:  %2d/%2d/%4d - %s\n", i, diary[i].day,
                diary[i].month, diary[i].year, diary[i].entry);
        free (diary[i].entry);  /* don't forget to free entries */
    }
    free (diary);   /* don't forget to free diary */

    return 0;
}

(note: you can further simplify the code by using POSIX getline() for your read instead of a fixed buf, and you can simplify the allocation and copy of each entry into your struct using strdup(), but neither is guaranteed to be available with every compiler -- use them if your compiler supports them and portability everywhere isn't a concern. Also note that %zu is the C99 format specifier for size_t, which gcc supports. If you are on Windows with an older Microsoft compiler, change each to %lu)
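
As a rough sketch of the getline()/strdup() variant mentioned in the note, assuming a POSIX system: read_entry below is a hypothetical helper name, and it reads a single line, trims the newline and returns a copy sized to fit.

```c
#define _POSIX_C_SOURCE 200809L /* request getline() and strdup() declarations */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>          /* ssize_t */

/* read one line from fp, strip the trailing '\n', and return a freshly
 * allocated copy sized to fit. returns NULL on EOF, read error or
 * allocation failure. the caller frees the returned string.
 * (read_entry is a hypothetical helper name) */
char *read_entry (FILE *fp)
{
    char *line = NULL;  /* getline allocates when line is NULL and n is 0 */
    size_t n = 0;
    char *entry = NULL;
    ssize_t nchr = getline (&line, &n, fp);

    if (nchr > 0) {
        if (line[nchr - 1] == '\n')     /* trim the included newline */
            line[--nchr] = 0;
        entry = strdup (line);          /* allocate and copy in one call */
    }
    free (line);        /* getline's buffer must be freed even on failure */

    return entry;
}
```

Called twice per diary entry, the first result can be handed to sscanf() for the date and the second stored directly as the entry pointer.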

Example Input File

$ cat dat/diary.txt
4
12/04/2010
Interview went well I think, though was told to wear shoes.
18/04/2010
Doc advised me to concentrate on something... I forget.
03/05/2010
Was asked today if I was an art exhibit.
19/05/2010
Apparently mudcakes not made of mud, or angry wasps.

Example Use/Output

$ ./bin/diary <dat/diary.txt
entry[ 0]:  12/ 4/2010 - Interview went well I think, though was told to wear shoes.
entry[ 1]:  18/ 4/2010 - Doc advised me to concentrate on something... I forget.
entry[ 2]:   3/ 5/2010 - Was asked today if I was an art exhibit.
entry[ 3]:  19/ 5/2010 - Apparently mudcakes not made of mud, or angry wasps.

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

On Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through one.

$ valgrind ./bin/diary <dat/diary.txt
==6403== Memcheck, a memory error detector
==6403== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6403== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6403== Command: ./bin/diary
==6403==
entry[ 0]:  12/ 4/2010 - Interview went well I think, though was told to wear shoes.
entry[ 1]:  18/ 4/2010 - Doc advised me to concentrate on something... I forget.
entry[ 2]:   3/ 5/2010 - Was asked today if I was an art exhibit.
entry[ 3]:  19/ 5/2010 - Apparently mudcakes not made of mud, or angry wasps.
==6403==
==6403== HEAP SUMMARY:
==6403==     in use at exit: 0 bytes in 0 blocks
==6403==   total heap usage: 5 allocs, 5 frees, 309 bytes allocated
==6403==
==6403== All heap blocks were freed -- no leaks are possible
==6403==
==6403== For counts of detected and suppressed errors, rerun with: -v
==6403== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

(note: the storage required for all diary entries (the entire diary) is only 309 bytes. That is less than 1/10th of the storage required when declaring char entry[1024];)
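
To make that comparison concrete, here is a small sketch that computes both totals. The struct and function names are hypothetical, and the exact figures depend on the platform's int and pointer sizes and on struct padding, so treat the numbers as approximate.

```c
#include <stddef.h>
#include <string.h>

#define MAXLENGTH 1024

/* the two layouts being compared (hypothetical names) */
struct fixed_entry   { int day, month, year; char entry[MAXLENGTH]; };
struct dynamic_entry { int day, month, year; char *entry; };

/* bytes needed when every struct embeds a MAXLENGTH buffer */
size_t fixed_bytes (size_t n)
{
    return n * sizeof (struct fixed_entry);
}

/* bytes needed when each entry is allocated to fit (strlen + 1) */
size_t dynamic_bytes (const char **text, size_t n)
{
    size_t i, total = n * sizeof (struct dynamic_entry);

    for (i = 0; i < n; i++)
        total += strlen (text[i]) + 1;

    return total;
}
```

For the four sample entries, the embedded-buffer layout needs a little over 4 * 1024 bytes, while the allocate-to-fit layout needs the four small structs plus one byte per character and one terminator per entry.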

Always confirm that you have freed all memory you have allocated and that there are no memory errors.


MS Windows

Since you seem to be having problems on Windows, the following is the code above with nothing changed except %lu substituted for %zu (the older Microsoft C runtime does not recognize %zu and treats it as a literal), compiled on Win7 with an older version of the VS compiler:

> cl /?
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

Compile

> cl /nologo /Wall /wd4706 /wd4996 /Ox /Foobj/diary /Febin/diary /Tc diary.c

(note: I put my .obj files in a subdirectory ./obj and my binary executables in ./bin to keep my source directory clean. That is the purpose of /Foobj/diary and /Febin/diary above)

Example Use/Output

> bin\diary.exe dat\diary.txt
entry[ 0]:  12/ 4/2010 - Interview went well I think, though was told to wear shoes.
entry[ 1]:  18/ 4/2010 - Doc advised me to concentrate on something... I forget.
entry[ 2]:   3/ 5/2010 - Was asked today if I was an art exhibit.
entry[ 3]:  19/ 5/2010 - Apparently mudcakes not made of mud, or angry wasps.

You must ensure you change each and every %zu to %lu, or you cannot expect proper output. You say you have changed all to int, but the snippet you posted in the comments below contains %zu -- this will not work on Windows.
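
If you want one format string that behaves the same with both compilers, a possible sketch is to cast the size_t value to unsigned long so the argument matches %lu exactly. format_index is a hypothetical helper name used only for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* write "entry[ N]" into out without relying on %zu. the size_t value is
 * cast to unsigned long so the argument type matches %lu exactly.
 * out must have room for at least 32 chars.
 * on platforms where size_t is wider than unsigned long (such as 64-bit
 * Windows) very large values would be truncated, which does not matter
 * for a small loop index.
 * (format_index is a hypothetical helper name) */
int format_index (char *out, size_t i)
{
    return sprintf (out, "entry[%2lu]", (unsigned long) i);
}
```

The same cast can be applied directly in each fprintf/printf call in the program instead of going through a helper.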

Look things over and let me know if you have further questions.