Word count program - stdin

Posted 2019-08-05 16:08

Question:

For the question below,

Write a program to read English text to end-of-data (type control-D to indicate end of data at a terminal, see below for detecting it), and print a count of word lengths, i.e. the total number of words of length 1 which occurred, the number of length 2, and so on.

Define a word to be a sequence of alphabetic characters. You should allow for word lengths up to 25 letters.

Typical output should be like this:

        length 1 : 10 occurrences
        length 2 : 19 occurrences
        length 3 : 127 occurrences
        length 4 : 0 occurrences
        length 5 : 18 occurrences
        ....

To read characters to end of data see above question.


Here is my working solution:

#include <stdio.h>

int main(void) {
  int ch; /* int rather than char, so EOF is distinct from every character */
  short wordCount[20] = {0};
  int count = 0;
  while ((ch = getchar()) != EOF) {
    if (ch == ' ' || ch == ',' || ch == ';' || ch == ':' || ch == '.' || ch == '/') {
      wordCount[count]++;
      count = 0;
    } else {
      count++;
    }
  }
  wordCount[count]++; // Incrementing here looks weird to me

  for (int i = 1; i < (int)(sizeof(wordCount) / sizeof(wordCount[0])); i++) {
    printf("length %d : %d occurrences\n", i, wordCount[i]);
  }
}

Questions:

1) From a code-elegance standpoint, can I avoid incrementing (++) wordCount outside the while loop?

2) Can I make the size of the wordCount array dynamic, based on word size, rather than a constant 20?

Note: I have learnt about struct but have yet to learn dynamic structures such as linked lists.

Answer 1:

For dynamic allocation, you can start with space for 20 shorts (although the problem statement appears to ask you to allow for words of up to 25 characters):

/* needs #include <stdlib.h>; calloc zero-fills the counters, malloc would not */
short maxWord = 20;
short *wordCount = calloc(maxWord, sizeof(*wordCount));

Then, when you increment count you can allocate more space if the current word is longer than can be counted in your dynamic array:

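} else {
    count++;
    if (count >= maxWord) {
        /* realloc takes the old pointer as its first argument; assign to a
           temporary so the original block is not lost if the call fails */
        short *tmp = realloc(wordCount, sizeof(*wordCount) * (maxWord + 1));
        if (tmp == NULL) { free(wordCount); return 1; }
        wordCount = tmp;
        wordCount[maxWord++] = 0; /* realloc does not zero the new slot */
    }
}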

Don't forget to free(wordCount) when you are done.

Since you don't need to count zero-length words, you might consider modifying your code so that wordCount[0] stores the number of words of length 1, and so on.
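
Putting the pieces of this answer together, a minimal sketch might look like the following. The names `struct histogram` and `histogram_add` are made up for illustration and are not from the question. The sketch also makes two choices the answer does not prescribe: it doubles the capacity instead of growing by one slot at a time, and it uses `unsigned long` counters rather than `short`. Slot `i` holds the count for length `i + 1`, as suggested above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper type: slot i counts the words of length i + 1,
   so no slot is wasted on zero-length words. */
struct histogram {
    unsigned long *counts; /* counts[i] = number of words of length i + 1 */
    size_t capacity;       /* number of slots currently allocated */
};

/* Record one word of the given length, growing the array when needed.
   Returns 0 on success, -1 if memory could not be allocated. */
int histogram_add(struct histogram *h, size_t length)
{
    assert(h != NULL);
    if (length == 0)
        return 0; /* an empty "word" is not recorded */

    if (length > h->capacity) {
        size_t new_cap = h->capacity ? h->capacity : 20;
        while (new_cap < length)
            new_cap *= 2; /* doubling keeps the number of reallocs small */

        /* realloc(NULL, n) behaves like malloc(n), so this also covers the
           first allocation. The old pointer is kept until success. */
        unsigned long *tmp = realloc(h->counts, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;

        for (size_t i = h->capacity; i < new_cap; i++)
            tmp[i] = 0; /* realloc leaves the new slots uninitialized */

        h->counts = tmp;
        h->capacity = new_cap;
    }

    h->counts[length - 1]++;
    return 0;
}
```

Under these assumptions you would start with `struct histogram h = {NULL, 0};`, call `histogram_add(&h, count)` each time a word ends, and call `free(h.counts)` once the results have been printed.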



Answer 2:

To 1): you could scan from one delimiting character to the next and only then increment wordCount. Treat EOF as a delimiting character as well.
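
One possible way to do that is sketched below. It is an illustration, not the only way: `count_step`, `count_word_lengths` and `MAX_LEN` are names invented here, and the sketch swaps the question's hand-written delimiter list for `isalpha`, since the exercise defines a word as a sequence of alphabetic characters. Because the value returned by `getc` is passed on before it is tested against EOF, the end of data ends the last word just like any other non-letter, and the increment appears in one place only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define MAX_LEN 25 /* the exercise asks for word lengths up to 25 */

/* Process one value returned by getc/getchar, which may be EOF.
   counts[n] is the number of words of length n, for 1 <= n <= MAX_LEN. */
void count_step(int ch, int *len, unsigned counts[MAX_LEN + 1])
{
    assert(len != NULL && counts != NULL);
    if (isalpha(ch)) { /* isalpha(EOF) is false, so EOF takes the else branch */
        (*len)++;
    } else {
        /* Any non-letter, including EOF, ends the current word.
           Empty and over-long runs are not counted. */
        if (*len >= 1 && *len <= MAX_LEN)
            counts[*len]++;
        *len = 0;
    }
}

/* Read `in` to end of data, feeding every character and finally EOF
   itself to count_step. */
void count_word_lengths(FILE *in, unsigned counts[MAX_LEN + 1])
{
    int ch; /* int, so EOF is distinct from every character value */
    int len = 0;
    do {
        ch = getc(in);
        count_step(ch, &len, counts);
    } while (ch != EOF);
}
```

With an array declared as `unsigned counts[MAX_LEN + 1] = {0};`, calling `count_word_lengths(stdin, counts)` fills it. The `*len >= 1` check also means a run of several non-letters in a row adds nothing to the counts.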

To 2): you can scan the file twice and then decide how much memory you need, or you can realloc dynamically whenever more memory is needed. The latter is what C++'s std::vector class does internally, for example.
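
The two-pass variant could be sketched roughly as follows. It assumes the input is a seekable stream such as a regular file, because `rewind` cannot replay text typed at a terminal or piped in. The function names `longest_word` and `count_exact` are invented for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Pass 1: return the length of the longest run of letters in `in`. */
size_t longest_word(FILE *in)
{
    int ch;
    size_t len = 0, longest = 0;
    while ((ch = getc(in)) != EOF) {
        len = isalpha(ch) ? len + 1 : 0;
        if (len > longest)
            longest = len;
    }
    return longest;
}

/* Pass 2: allocate exactly longest + 1 zeroed counters and fill them, so
   that counts[n] is the number of words of length n.
   Returns NULL on allocation failure; the caller frees the result. */
unsigned long *count_exact(FILE *in, size_t *longest_out)
{
    size_t longest = longest_word(in);
    unsigned long *counts = calloc(longest + 1, sizeof *counts);
    if (counts == NULL)
        return NULL;

    rewind(in); /* only meaningful on seekable streams such as files */

    int ch;
    size_t len = 0;
    do {
        ch = getc(in);
        if (isalpha(ch)) {
            len++;
        } else {
            if (len > 0)
                counts[len]++; /* len <= longest, so this stays in bounds */
            len = 0;
        }
    } while (ch != EOF);

    *longest_out = longest;
    return counts;
}
```

Since the exercise reads from the terminal, growing the array with realloc as the input arrives is probably the more practical of the two options.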

Also, you should think about what happens if two delimiting characters follow one another. Right now you would count this as a word (of length zero).



Tags: c eof getchar