How to assign values dynamically to a struct

Posted 2019-08-27 16:54

Question:

I am stumped as to how to access and change the values of a struct. The program takes in some external files, tokenizes each line, and sorts the tokens into the following fields of climate info. The external files look something like this:

TDV format:

 TN     1424325600000   dn20t1kz0xrz    67.0    0.0  0.0     0.0    101872.0    262.5665
 TN     1422770400000   dn2dcstxsf5b    23.0    0.0  100.0   0.0    100576.0    277.8087
 TN     1422792000000   dn2sdp6pbb5b    96.0    0.0  100.0   0.0    100117.0    278.49207
 TN     1422748800000   dn2fjteh8e80    6.0     0.0  100.0   0.0    100661.0    278.28485
 TN     1423396800000   dn2k0y7ffcup    14.0    0.0  100.0   0.0    100176.0    282.02142

The columns are, in order: state code; timestamp (in milliseconds since the Unix epoch); geohash string for the location (unused); percentage humidity; snow present (values 0.0 or 1.0); percentage cloud cover; number of lightning strikes; pressure (units unknown, but the data is unused so it doesn't matter); and surface temperature (measured in Kelvin). I realize I must convert the timestamp and surface temperature, so I am not worried about that. I need to aggregate the data across a complete state (regardless of geohash), keeping track of the minimum and maximum temperatures and the times when they occurred, and counting the number of records for the state so that values can be averaged.

The output for a single state should look like this:

 * Opening file: data_tn.tdv
 * States found: TN
 * -- State: TN --
 * Number of Records: 17097
 * Average Humidity: 49.4%
 * Average Temperature: 58.3F
 * Max Temperature: 110.4F on Mon Aug  3 11:00:00 2015
 * Min Temperature: -11.1F on Fri Feb 20 04:00:00 2015
 * Lightning Strikes: 781
 * Records with Snow Cover: 107
 * Average Cloud Cover: 53.0%

However, there will be multiple states, each with its own data file to be processed.

As you can see, the first token should be assigned to the state code, but I have no idea how to do this. I have tried strcpy and numerous other methods to send the tokens into their respective fields, but none have worked.

    struct climate_info
    {
        char code[3];
        unsigned long num_records;
        unsigned long timestamp;
        char location[13];
        unsigned int humidity;
        int snow;
        unsigned int cover;
        int strikes;
        long double pressure;
        long double sum_temperature;
    };

    struct stats
    {
        char code[3];
        long long timestamp;
        double humidity;
        double snow;
        double cloud;
        double strikes;
        double sum_temperature;
    } stats;



    void analyze_file(FILE *file, struct climate_info *states[], int num_states);
    void print_report(struct climate_info *states[], int num_states);

    int main(int argc, char *argv[])
    {
        /* TODO: fix this conditional. You should be able to read multiple files. */
        if (argc < 1 )
        {
            printf("Usage: %s tdv_file1 tdv_file2 ... tdv_fileN \n", argv[0]);
            return EXIT_FAILURE;
        }

        /* Let's create an array to store our state data in. As we know, there are
         * 50 US states. */
        struct climate_info *states[NUM_STATES] = { NULL };

        int i;
        for (i = 1; i < argc; ++i)
        {
            /* TODO: Open the file for reading */

            /* TODO: If the file doesn't exist, print an error message and move on
             * to the next file. */
            /* TODO: Analyze the file */
            /* analyze_file(file, states, NUM_STATES); */
            FILE *fp = fopen(argv[i], "r");
            if (fp == NULL)
            {
                printf("Error opening file");
                break;
            }
            else if (fp)
            {
                analyze_file(fp, states, NUM_STATES);
            }
            fclose(fp);
        }
        print_report(states, NUM_STATES);
        return 0;
    }

    void analyze_file(FILE *file, struct climate_info **states, int num_states)
    {
        const int line_sz = 100;
        char line[line_sz];
        int counter = 0;
        char *token;
        while (fgets(line, line_sz, file) != NULL)
        {
            /* TODO: We need to do a few things here:
             *
             *       * Tokenize the line.
             *       * Determine what state the line is for. This will be the state
             *         code, stored as our first token.
             *       * If our states array doesn't have a climate_info entry for
             *         this state, then we need to allocate memory for it and put it
             *         in the next open place in the array. Otherwise, we reuse the
             *         existing entry.
             *       * Update the climate_info structure as necessary.
             */
            struct climate_info *y = malloc(sizeof(struct climate_info)*num_states);
            token = strtok(line," \t");
            strcpy((y[counter]).code,token);
            counter++;
            printf("%s\n",token);
            while(token)
            {
                printf("token: %s\n", token);
                token = strtok(NULL, " \t");
            }
            printf("%d\n",counter);
            //free(states);
        }
    }

    void print_report(struct climate_info *states[], int num_states)
    {
        printf("States found: ");
        int i;
        for (i = 0; i < num_states; ++i) {
            if (states[i] != NULL)
            {
                struct climate_info *info = states[i];
                printf("%s", info->code);
            }
        }
        printf("\n");

Answer 1:

The values read from the file shouldn't be assigned directly to elements of the structure. You need one set of variables (they could be in a structure, but it isn't necessary) to receive the data as it is read, with sscanf() doing the parsing and splitting up. You then validate that the state code is correct, that the time is plausible, and so on. Then you add the cumulative information into the 'statistics structure', which is related to but different from the struct climate_info you currently have. It doesn't need a geohash column, for example, or a pressure column, but does need a minimum temperature and a time when that was spotted, and a maximum temperature and the time when that was spotted. You accumulate the snow cover count, and the lightning strike count, and the humidity and cloud cover and current temperature. Then when you finish the file, you can average the temperature, humidity and cloud cover values, and you can print the aggregates.
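The 'statistics structure' described above could look something like the following. This is only a minimal sketch: struct state_stats, find_or_add_state() and add_record() are hypothetical names that do not appear in the question, and the sketch assumes state codes are exactly two letters and that entries are never removed from the array, so the first NULL slot marks the end of the used part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NUM_STATES 50

/* Hypothetical per-state accumulator; the field names are illustrative. */
struct state_stats
{
    char code[3];               /* two-letter state code plus '\0' */
    unsigned long num_records;  /* number of lines seen for this state */
    unsigned long snow_records; /* lines whose snow value was non-zero */
    double strikes;             /* total lightning strikes */
    double sum_humidity;
    double sum_cloud;
    double sum_temperature;     /* Kelvin, as stored in the file */
    double max_temperature;     /* Kelvin */
    long long max_time;         /* ms since the epoch */
    double min_temperature;     /* Kelvin */
    long long min_time;         /* ms since the epoch */
};

/* Return the entry for code, allocating it in the first free slot if it
 * does not exist yet. Returns NULL if the array is full or calloc fails. */
struct state_stats *find_or_add_state(struct state_stats *states[],
                                      int num_states, const char *code)
{
    assert(states != NULL && code != NULL);
    for (int i = 0; i < num_states; ++i)
    {
        if (states[i] == NULL)
        {
            struct state_stats *s = calloc(1, sizeof *s);
            if (s == NULL)
                return NULL;
            /* calloc zeroed the struct, so code stays '\0'-terminated */
            strncpy(s->code, code, sizeof s->code - 1);
            states[i] = s;
            return s;
        }
        if (strcmp(states[i]->code, code) == 0)
            return states[i];
    }
    return NULL;
}

/* Fold one validated record into the running totals for a state. */
void add_record(struct state_stats *s, long long millitime, double humidity,
                double snow, double cloud, double lightning,
                double temperature)
{
    assert(s != NULL);
    if (s->num_records == 0 || temperature > s->max_temperature)
    {
        s->max_temperature = temperature;
        s->max_time = millitime;
    }
    if (s->num_records == 0 || temperature < s->min_temperature)
    {
        s->min_temperature = temperature;
        s->min_time = millitime;
    }
    s->num_records++;
    s->sum_humidity += humidity;
    s->sum_cloud += cloud;
    s->sum_temperature += temperature;
    s->strikes += lightning;
    if (snow > 0.0)
        s->snow_records++;
}
```

Once a file has been read, each average is the corresponding sum divided by num_records. The Kelvin and millisecond values can stay in their raw form until the report is printed.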

Since you are sensibly using fgets() to read lines from the file (don't change that!), you should use sscanf() to parse the line. You need:

  • a state code (char state[3];),
  • a time value (long long millitime;),
  • a humidity value (double humidity;),
  • a 'snow present' value (double snow; since the format is a floating point number),
  • a 'cloud cover' value (double cloud;),
  • a lightning strikes value (double lightning;),
  • and a temperature value (double temperature;).

You then read them using

if (sscanf(line, "%2[A-Z] %lld %*s %lf %lf %lf %lf %*lf %lf",
           state, &millitime, &humidity, &snow, &cloud, &lightning, &temperature) == 7)
{
    /* …validate data and report errors if appropriate… */
    /* …stash values appropriately; increment the count… */
}
else
{
    /* …report format error?… */
}

Note that the * in the formats suppresses the assignment; the column is read but ignored. The code checks that the pressure column is numeric; it doesn't validate the geohash column beyond 'it must exist'. It would be possible to specify an upper bound on its size with %*12s.
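To try the format string on one of the sample lines, the call can be wrapped in a small helper. This is a sketch only: struct record and parse_line() are hypothetical names. The one change from the format above is a leading space, added on the assumption that lines may begin with white space as the sample data appears to; unlike %lld and %lf, a %[ scanset does not skip leading white space by itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the fields of one parsed line. */
struct record
{
    char state[3];       /* two-letter state code plus '\0' */
    long long millitime; /* ms since the epoch */
    double humidity;
    double snow;
    double cloud;
    double lightning;
    double temperature;  /* Kelvin */
};

/* Parse one line of the TDV file into r.
 * Returns 1 if all seven wanted fields were converted, 0 otherwise. */
int parse_line(const char *line, struct record *r)
{
    assert(line != NULL && r != NULL);
    /* %*s skips the geohash and %*lf skips the pressure; suppressed
     * conversions are not counted in the return value of sscanf(). */
    int n = sscanf(line, " %2[A-Z] %lld %*s %lf %lf %lf %lf %*lf %lf",
                   r->state, &r->millitime, &r->humidity, &r->snow,
                   &r->cloud, &r->lightning, &r->temperature);
    return n == 7;
}
```

Calling parse_line() on the first sample line should fill state with "TN", millitime with 1424325600000 and humidity with 67.0. A line whose state code is in lower case fails at the first conversion, so the function returns 0.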

One of the many advantages of using fgets() and sscanf() is that you can report errors much more intelligibly: you can say "the state code was incorrect in line XXX:" and then print the line, since you still have it available. Using fscanf(), you'd not be able to report so easily on the content of the line, which makes it harder for whoever is debugging the data.
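As an illustration of that kind of error report, the loop below counts lines as it reads them and echoes any line that fails to parse. It is a sketch under assumptions: check_file() and its parameters are hypothetical, the 256-byte buffer is an arbitrary choice (longer lines would be split across reads), and the leading space in the format is an addition so that lines starting with white space are accepted.

```c
#include <assert.h>
#include <stdio.h>

/* Read every line of file, writing a message to errs for each line that
 * does not match the expected format. name is used only in the messages.
 * Returns the number of bad lines; *good receives the number of valid ones. */
int check_file(FILE *file, const char *name, FILE *errs, int *good)
{
    assert(file != NULL && name != NULL && errs != NULL && good != NULL);
    char line[256];
    int lineno = 0;
    int bad = 0;

    *good = 0;
    while (fgets(line, sizeof line, file) != NULL)
    {
        char state[3];
        long long millitime;
        double humidity, snow, cloud, lightning, temperature;

        ++lineno;
        if (sscanf(line, " %2[A-Z] %lld %*s %lf %lf %lf %lf %*lf %lf",
                   state, &millitime, &humidity, &snow, &cloud,
                   &lightning, &temperature) == 7)
        {
            ++*good;
        }
        else
        {
            /* line still holds the raw text, normally with its newline,
             * so it can be shown to whoever is debugging the data. */
            fprintf(errs, "%s:%d: format error: %s", name, lineno, line);
            ++bad;
        }
    }
    return bad;
}
```

A call such as check_file(fp, argv[i], stderr, &good) would fit into the loop in main(). Blank lines are reported as errors here, because sscanf() returns EOF for them.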