Scanning data from text file, that doesn't hav

Posted 2019-08-31 01:18

Question:

I have run into a problem with my homework. I need to read some data from a text file into a struct. The text file looks like this:

012345678;danny;cohen;22;M;danny1993;123;1,2,4,8;Nice person
223325222;or;dan;25;M;ordan10;1234;3,5,6,7;Singer and dancer
203484758;shani;israel;25;F;shaninush;12345;4,5,6,7;Happy and cool girl
349950234;nadav;cohen;50;M;nd50;nadav;3,6,7,8;Engineer very smart
345656974;oshrit;hasson;30;F;osh321;111;3,4,5,7;Layer and a painter 

Each item of data should go into its matching variable: id = 012345678, first_name = danny, and so on.

I can't use fscanf as I normally would because there is no spacing between the items, and fgets reads the whole line.

I found a solution using %[^;]s, but then I would need to write one block of code and copy and paste it 9 times, once for each item of data.

Is there any other option that doesn't require changing the text file and that is similar to the code I would write with fscanf if there were spaces between the items?

************* UPDATE **************

Hey, first of all, thanks everyone for the help, I really appreciate it. I didn't understand all of your answers, but here is something I did use.

Here's my code:

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
    char *idP, *firstNameP, *lastNameP;
    int age;
    char gender, *userNameP, *passwordP, hobbies, *descriptionP;
}user;


void main() {
    FILE *fileP;
    user temp;
    char test[99];

    temp.idP = (char *)malloc(99);
    temp.firstNameP = (char *)malloc(99);
    temp.lastNameP = (char *)malloc(99);
    temp.age = (int )malloc(4);
    temp.gender = (char )malloc(sizeof(char));
    temp.userNameP = (char *)malloc(99);

    fileP = fopen("input.txt", "r");
    fscanf(fileP, "%9[^;];%99[^;];%99[^;];%d;%c", temp.idP,temp.firstNameP,temp.lastNameP,&temp.age, temp.gender);
    printf("%s\n%s\n%s\n%d\n%c", temp.idP, temp.firstNameP, temp.lastNameP, temp.age, temp.gender);

    fgets(test, 60, fileP); // Just testing where it stop scanning
    printf("\n\n%s", test);

    fclose(fileP);
    getchar();
}

It all works well until I scan the int variable; right after that it doesn't scan anything, and I get an error.

Thanks a lot.

Answer 1:

As discussed in the comments, fscanf is probably the shortest option (although fgets followed by strtok, and manual parsing are viable options).

You need to use the %[^;] specifier for the string fields (meaning: a string of characters other than ;), with a literal ; between the specifiers in the format string to consume the actual semicolons (which we specifically requested not to be consumed as part of the string field). The last field should be %[^\n] to consume up to the newline, since the input doesn't have a terminating semicolon.

You should also (always) limit the length of each string field read with a scanf family function to one less than the available space (the terminating NUL byte is the +1). So, for example, if the first field is at most 9 characters long, you would need char field1[10] and the format would be %9[^;].

It is usually a good idea to put a single space at the beginning of the format string to consume any whitespace (such as the previous newline), because %[ and %c, unlike %d and %s, do not skip leading whitespace on their own.

And, of course you should check the return value of fscanf, e.g., if you have 9 fields as per the example, it should return 9.

So, the end result would be something like:

if (fscanf(file, " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%99[^;];%99[^\n]",
           s.field1, s.field2, s.field3, &s.field4, …, s.field9) != 9) {
    // error
    break;
}

(Alternatively, the field with numbers separated by commas could be read as four separate fields as %d,%d,%d,%d, in which case the count would go up to 12.)
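
To make the pieces above concrete, here is a minimal sketch that parses one record, under a few assumptions that are not in the question: the struct layout and the names user_record, parse_user_line and read_users are hypothetical, the password is stored as text (one sample line has the password "nadav", which %d would reject), the hobbies are read as four separate integers, and each line fits in 511 characters. It swaps fscanf for fgets followed by sscanf with the same kind of format string, so that a malformed line cannot leave the stream in the middle of a record.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record layout: fixed-size arrays, so no malloc is needed. */
typedef struct {
    char id[10];            /* 9 digits + terminating NUL */
    char first_name[100];
    char last_name[100];
    int  age;
    char gender;
    char user_name[100];
    char password[100];     /* text: a password such as "nadav" is not a number */
    int  hobbies[4];
    char description[100];
} user_record;

/* Parses one line; returns 1 when all 12 conversions succeed, 0 otherwise. */
int parse_user_line(const char *line, user_record *u)
{
    assert(line != NULL && u != NULL);
    int converted = sscanf(line,
        " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%d,%d,%d,%d;%99[^\n]",
        u->id, u->first_name, u->last_name, &u->age, &u->gender,
        u->user_name, u->password,
        &u->hobbies[0], &u->hobbies[1], &u->hobbies[2], &u->hobbies[3],
        u->description);
    return converted == 12;
}

/* Reads up to max records from fp, one per line, skipping lines that do not
   parse; returns the number of records stored in out. */
size_t read_users(FILE *fp, user_record *out, size_t max)
{
    char line[512];
    size_t n = 0;
    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (parse_user_line(line, &out[n]))
            n++;
    }
    return n;
}
```

Note that the scalar members (age, gender and the hobbies) are passed by address with &, while the char arrays are passed as they are. A caller would open the file with fopen, check the result for NULL, and pass it to read_users together with an array of user_record. A %[^;] conversion fails on an empty field, so a line such as "1;;cohen;..." is rejected rather than parsed.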



Answer 2:

Here is a simple tokenizer. As I see it, you have more than one delimiter here (; and ,).

str - string to be tokenized

del - string containing delimiters (in your case ";," or ";" only)

allowempty - if true, empty tokens are kept when there are two or more consecutive delimiters

The return value is a NULL-terminated table of pointers to the tokens (or NULL if no token was found).

#include <stdlib.h>
#include <string.h>

char **mystrtok(const char *str, const char *del, int allowempty)
{
  char **result = NULL;
  const char *end = str;
  size_t size = 0;

  while(*end)
  {
    /* a token ends at a delimiter or at the last character of the string */
    int isdel = strchr(del, *end) != NULL;
    if(isdel || !*(end + 1))
    {
        /* a token that ends at the last character includes that character */
        size_t len = (size_t)(end - str) + (isdel ? 0 : 1);
        /* add temp variable and malloc / realloc checks */
        /* free allocated memory on error */
        if(allowempty || len > 0)
        {
            result = realloc(result, (++size + 1) * sizeof(*result));
            result[size] = NULL;
            result[size - 1] = malloc(len + 1);
            memcpy(result[size - 1], str, len);
            result[size - 1][len] = 0;
        }
        str = end + 1;
    }
    end++;
  }
  return result;
}

To free the memory allocated by the tokenizer:

void myfree(char **ptr)
{
    char **savedptr = ptr;
    if(!ptr) return; /* the tokenizer returns NULL when it found no tokens */
    while(*ptr)
    {
        free(*ptr++);
    }
    free(savedptr);
}

The function is simple, but you can use any separators and any number of them.
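
The comments in the tokenizer leave the malloc / realloc checks to the reader. As a rough sketch of one way to add them (not the only way), here is a checked variant built on strcspn. The names split_fields and free_fields are hypothetical, and the behaviour differs slightly from the function above: it always returns a table when allocation succeeds (possibly holding only the terminating NULL), and with allowempty set, a trailing delimiter produces a trailing empty token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees a NULL-terminated table of tokens; NULL is accepted. */
void free_fields(char **fields)
{
    if (fields == NULL)
        return;
    for (char **p = fields; *p != NULL; p++)
        free(*p);
    free(fields);
}

/* Hypothetical checked tokenizer: splits str at any character found in del.
   Empty tokens are kept when allowempty is nonzero. Returns a NULL-terminated
   table, or NULL if an allocation fails (nothing is leaked in that case). */
char **split_fields(const char *str, const char *del, int allowempty)
{
    assert(str != NULL && del != NULL);

    size_t count = 0;
    char **result = malloc(sizeof *result);    /* room for the terminating NULL */
    if (result == NULL)
        return NULL;
    result[0] = NULL;

    for (;;) {
        size_t len = strcspn(str, del);         /* length of the next token */
        if (len > 0 || allowempty) {
            char **grown = realloc(result, (count + 2) * sizeof *result);
            char *token = malloc(len + 1);
            if (grown != NULL)
                result = grown;                 /* the old pointer is no longer valid */
            if (grown == NULL || token == NULL) {
                free(token);
                free_fields(result);            /* table is still NULL-terminated */
                return NULL;
            }
            memcpy(token, str, len);
            token[len] = '\0';
            result[count++] = token;
            result[count] = NULL;
        }
        if (str[len] == '\0')
            break;
        str += len + 1;                         /* step over the delimiter */
    }
    return result;
}
```

For the file in the question, calling split_fields(line, ";", 0) on a line without its trailing newline gives nine tokens, and the eighth one ("1,2,4,8") can be split again with "," as the delimiter. Numeric fields such as the age still have to be converted, for example with strtol.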



Tags: c text scanf