Read from csv file and separate into variable

Posted 2019-08-09 17:15

Question:

I'm trying to separate my input values into 2 different categories. The first array, called teamname, would hold the team names, and the second array would hold the score for that week. My input file is a .csv. With the code the way it is, everything is stored as a single string instead of in 2 separate variables. Also, I'm not too program savvy and am only familiar with the library.

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>

#define FILEIN "data.csv"
#define FILEOUT "matrix.csv"

int main (void)
{
    double nfl[32][32], teamscore[32];
    char teamname[30];
    int n;
    FILE *filein_ptr;
    FILE *fileout_ptr;

    filein_ptr = fopen (FILEIN, "r");
    fileout_ptr = fopen (FILEOUT, "w");

    for (n = 1; n <= 32; n++) {
        fscanf (filein_ptr, "%s  %lf\n", &teamname, &teamscore[n]);
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore);
    }

    fclose (filein_ptr);
    fclose (fileout_ptr);

    return 0;
}

I should say that the input file has team names in the first column and team scores in the second. Any help would be great. Thanks! Here is a sample input file:

  • Steelers,20
  • Patriots,25
  • Raiders,15
  • Chiefs,35

Answer 1:

In addition to changing &teamname to teamname, there are a few other things you may want to look at. First, always initialize your variables. It is not required, but it has several benefits. For numeric arrays, it sets every element, which prevents an accidental read of an uninitialized value. For character arrays, initializing to 0 ensures that the first string copied in (if shorter than the array) will be null-terminated, and it likewise prevents reading an uninitialized value. It's just a good habit:

    double teamscore[MAXS] = {0.0};
    char teamname[30] = {0};
    int n = 0;

You have defined constants for your input and output filenames (FILEIN and FILEOUT); you can do the same for your array sizes (MAXS above). That makes your code easier to maintain, since there is a single value to update if your array size needs to change.

Next, and this is rather a nit, but an important one: main accepts arguments, defined by the standard as int argc, char **argv (you may also see a char **envp on Unix systems, and you may see them written in the equivalent forms char *argv[] and char *envp[]). The point is to use them to pass arguments to your program so you are not stuck with the hardcoded data.csv and matrix.csv filenames. You can keep your hardcoded values as defaults and still give the user the ability to supply filenames of their choice by using a simple ternary operator (e.g. test ? if-true code : if-false code;):

    FILE *filein_ptr = argc > 1 ? fopen (argv[1], "r") : fopen (FILEIN, "r");
    FILE *fileout_ptr = argc > 2 ? fopen (argv[2], "w") : fopen (FILEOUT, "w");

There, the test is argc > 1 (meaning the user gave at least one argument). If true, the code is fopen (argv[1], "r") (open the filename given as the argument for reading); if false, the code is fopen (FILEIN, "r") (open your default when no filename is given). The same holds for your output file, tested with argc > 2. (The filenames must be given in the correct order: input first, then output.)
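
As a variation on the same idea, the ternary can choose just the filename, so that fopen is written only once. This is a minimal sketch; pick_name and open_input are hypothetical helper names, not part of the original program:

```c
#include <stdio.h>

/* Return argv[idx] when the user supplied it, otherwise the fallback name.
 * pick_name is a hypothetical helper, not part of the original program. */
const char *pick_name(int argc, char **argv, int idx, const char *fallback)
{
    return argc > idx ? argv[idx] : fallback;
}

/* Open the input file named on the command line, or "data.csv" by default. */
FILE *open_input(int argc, char **argv)
{
    return fopen(pick_name(argc, argv, 1, "data.csv"), "r");
}
```

The output file would be handled the same way with index 2 and "matrix.csv". The returned FILE pointer still has to be checked for NULL before use.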

Then if you open a file, you must validate that the file is actually open before you attempt to read from it. While you can test the input and output separately to tell which one failed, you can also use a simple || condition to check if either open failed:

    if (!filein_ptr || ! fileout_ptr) {
        fprintf (stderr, "error: filein or fileout open failed.\n");
        return 1;
    }
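
If you would rather know which of the two opens failed, one option is to check each file as it is opened. The sketch below assumes a small wrapper (open_or_report is a made-up name) that uses the standard perror function to print the filename followed by the system's reason for the failure:

```c
#include <stdio.h>

/* Open path with the given mode. On failure, report which file failed
 * (perror appends the system's reason) and return NULL.
 * open_or_report is a hypothetical helper name. */
FILE *open_or_report(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (!fp)
        perror(path);
    return fp;
}
```

In main you would call it once for the input and once for the output, returning 1 as soon as either call gives NULL (and closing the input first if only the output failed).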

Lastly, if you know the number of lines of data you need to read, an indexed for loop like yours is fine, but you will rarely know the number of lines in a data file beforehand. (Note also that a 32-element array is indexed 0 to 31, so a loop running from 1 to 32 writes past the end.) Even when using a for loop, you still need to check the return of fscanf to verify that you actually had 2 valid conversions (and therefore got the 2 values you were expecting). Checking the return provides another benefit: it allows you to continue reading until you no longer get 2 valid conversions from fscanf. This provides an easy way to read an unknown number of values from a file. However, you do need to ensure you do not try to read more values into your arrays than they will hold, e.g.:

    while (fscanf (filein_ptr, " %29[^,],%lf", teamname, &teamscore[n]) == 2) {
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore[n++]);
        if (n == MAXS) {  /* check data doesn't exceed MAXS */
            fprintf (stderr, "warning: data exceeds MAXS.\n");
            break;
        }
    }
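
To see what the return check buys you, here is a small sketch that applies the same format string to a single line with sscanf (which takes the same conversions as fscanf). parse_record and NAME_LEN are names assumed for illustration only:

```c
#include <stdio.h>

#define NAME_LEN 30   /* assumed buffer size, matching teamname[30] */

/* Parse one "name,score" record from line.
 * Returns 1 only when both the name and the score were converted,
 * and 0 for a missing comma, a non-numeric score, or an empty line. */
int parse_record(const char *line, char name[NAME_LEN], double *score)
{
    /* sscanf returns the number of successful conversions (or EOF) */
    return sscanf(line, " %29[^,],%lf", name, score) == 2;
}
```

A line such as "Steelers,20" is accepted, while "Chiefs" (no comma or score) and "Chiefs,abc" (no number) are rejected, because sscanf then reports fewer than 2 conversions.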

note: when using a format specifier that contains a character class (like "%[^,], ..."), be aware that it will read and include leading and trailing whitespace in the converted string. So if your file has ' Steelers ,..', teamname will include the whitespace. You can skip the leading whitespace by putting a space before the conversion (like " %29[^,], ..."), and you can limit the number of characters read by specifying a maximum field width. (Trailing whitespace, in this case, is more easily trimmed after the read.)
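
Trimming the trailing whitespace after the read could look something like the sketch below. rtrim is a hypothetical helper, not a standard library function; it only relies on strlen and isspace:

```c
#include <ctype.h>
#include <string.h>

/* Remove trailing whitespace from s in place and return s.
 * rtrim is a hypothetical helper, not a standard library function. */
char *rtrim(char *s)
{
    size_t len = strlen(s);

    /* walk back from the end, overwriting whitespace with terminators */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';

    return s;
}
```

Calling rtrim (teamname) right after a successful fscanf would turn "Steelers " into "Steelers" before the name is written out.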

Putting all the pieces together, you could make your code more flexible and reliable by taking arguments from the user, and validating your file and read operations:

#define _CRT_SECURE_NO_WARNINGS 1
#include <stdio.h>

#define FILEIN "data.csv"
#define FILEOUT "matrix.csv"
#define MAXS 32

int main (int argc, char **argv)
{
    /* double nfl[MAXS][MAXS] = {{0}}; */
    double teamscore[MAXS] = {0.0};
    char teamname[30] = {0};
    int n = 0;
    FILE *filein_ptr = argc > 1 ? fopen (argv[1], "r") : fopen (FILEIN, "r");
    FILE *fileout_ptr = argc > 2 ? fopen (argv[2], "w") : fopen (FILEOUT, "w");

    if (!filein_ptr || ! fileout_ptr) {
        fprintf (stderr, "error: filein or fileout open failed.\n");
        return 1;
    }

    while (fscanf (filein_ptr, " %29[^,],%lf", teamname, &teamscore[n]) == 2) {
        fprintf (fileout_ptr, "%s    %f\n", teamname, teamscore[n++]);
        if (n == MAXS) {  /* check data doesn't exceed MAXS */
            fprintf (stderr, "warning: data exceeds MAXS.\n");
            break;
        }
    }

    fclose (filein_ptr);
    fclose (fileout_ptr);

    return 0;
}

Test Input

$ cat ../dat/teams.txt
Steelers,   20
Patriots,25
    Raiders,    15
    Chiefs,35

note: the variations in leading whitespace and whitespace between values was intentional.

Use/Output

$ ./bin/teams ../dat/teams.txt teamsout.txt

$ cat teamsout.txt
Steelers    20.000000
Patriots    25.000000
Raiders    15.000000
Chiefs    35.000000

Let me know if you have further questions.



Answer 2:

If you are going to store the team names in an array, you should declare a two-dimensional array:

char team_names[N_OF_TEAMS][MAX_CHAR_IN_NAME];

Then you declare the array for the scores. You are using doubles to store the scores; aren't they only integers?

double scores[N_OF_TEAMS];

To read those values you can use:

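/* MAX_CHAR_IN_NAME (30) must be #defined before this function */
int read_name_and_score( const char * fname, int m, char nn[][MAX_CHAR_IN_NAME], double * ss)
{
    FILE *pf;
    int count = 0;

    if (!fname) {
        printf("Error, no file name.\n");
        return -1;
    }
    pf = fopen(fname,"r");      /* the mode is a string, not a char */
    if (!pf) {
        printf("An error occurred while opening file %s.\n",fname);
        return -2;
    }

    /* %29[^,] reads at most 29 chars (MAX_CHAR_IN_NAME - 1), stopping at the comma;
       %lf matches the double each score is stored in */
    while ( count < m && fscanf(pf, " %29[^,],%lf", nn[count], &ss[count]) == 2 ) count++;

    if (fclose(pf) != 0) {      /* fclose returns 0 on success */
        printf("An error occurred while closing file %s.\n",fname);
    }
    return count;
}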

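You need the [^,] to stop scanf from reading the string when it finds a comma. The #defines below must appear before read_name_and_score, since its signature uses MAX_CHAR_IN_NAME. The main will be something like: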

#include <stdio.h>

#define N_OF_TEAMS 32
#define MAX_CHAR_IN_NAME 30

int main(void) {
    char team_names[N_OF_TEAMS][MAX_CHAR_IN_NAME];
    double scores[N_OF_TEAMS];
    int n;

    n = read_name_and_score("data.csv",N_OF_TEAMS,team_names,scores);
    if ( n != N_OF_TEAMS) {
        printf("Error, not enough data was read.\n");
        /* It's up to you to decide what to do now */
    }

    /* do whatever you want with data */

    return 0;
}
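
To see why the [^,] matters, it may help to compare it with the %s conversion from the question on one of the sample lines. This is only a sketch; scan_with_s and scan_with_scanset are made-up names, and sscanf stands in for fscanf so that no input file is needed:

```c
#include <stdio.h>

/* Try the question's "%s %lf" style format on one line (with a width
 * limit added for safety). Returns the conversion count from sscanf. */
int scan_with_s(const char *line, char name[30], double *score)
{
    /* %s only stops at whitespace, so it swallows the comma and the score */
    return sscanf(line, "%29s %lf", name, score);
}

/* Same line, but stop the name at the first comma. */
int scan_with_scanset(const char *line, char name[30], double *score)
{
    return sscanf(line, " %29[^,],%lf", name, score);
}
```

On "Steelers,20" the first version reports a single conversion and leaves the whole line in name, which is the "everything is stored as a string" symptom from the question. The second version reports 2 conversions, with "Steelers" in name and 20 in score.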


Tags: c file input