Comparing unsigned char and EOF

Posted 2019-01-07 21:47

Question:

When the following code is compiled and run, it goes into an infinite loop:

#include <stdio.h>
#include <stdlib.h>
int main()
{
    unsigned char  ch;
    FILE *fp;
    fp = fopen("abc","r");
    if(fp==NULL)
    {
        printf("Unable to Open");
        exit(1);
    }
    while((ch = fgetc(fp))!=EOF)
    printf("%c",ch);
    fclose(fp);
    printf("\n",ch);
    return 0;
}

gcc also gives a warning on compilation:

abc.c:13:warning: comparison is always true due to limited range of data type

The code runs fine when unsigned char is replaced by char or int, as expected, i.e. it terminates.
But the code also runs fine with unsigned int. I have read that EOF is defined as -1 in stdio.h, so why does this code fail for unsigned char but run fine for unsigned int?

Answer 1:

The golden rule for writing this line is

   while ((ch = fgetc(stdin)) != EOF)

ch should be int. Your cute trick of making ch unsigned fails because EOF is a signed int quantity.

OK, let's now go into the details.

Step 1:

ch=fgetc(fp)

At end of file fgetc() returns EOF, which is -1 here (a signed int). Assigning it to the unsigned char ch keeps only the low-order octet, which is all 1s, and hence the value 255. The bit pattern of ch after the execution of

ch = fgetc(fp); 

would thus be

11111111

Step 2:

ch != EOF

Now EOF is a signed int and ch is an unsigned char.

Again I refer to the golden rule of C: the smaller operand, ch, is converted to int before the comparison, so its bit pattern is now

00000000000000000000000011111111 = 255 (decimal)

while EOF is

11111111111111111111111111111111 = -1 (decimal)

There is no way they can be equal. Hence the condition steering the following while loop

while ((ch = fgetc(stdin)) != EOF)

will never evaluate to false.

And hence the infinite loop.
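A minimal sketch of the two steps above, assuming EOF is -1 and unsigned char is 8 bits wide (true on common platforms, but not guaranteed by the standard). The function name is made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: stores an fgetc()-style return value in an
   unsigned char (step 1), widens it back to int the way the != operator
   would (step 2), and reports whether it still equals EOF. */
int equals_eof_after_unsigned_char(int value)
{
    unsigned char ch = (unsigned char)value; /* EOF (-1) becomes 255 here */
    int widened = ch;                        /* promotion yields 0..255, never negative */
    return widened == EOF;
}
```

Passing EOF to this function should return 0 under those assumptions: once the value has been through an unsigned char, no input can make the comparison succeed, which is why the loop never ends.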



Answer 2:

There are several implicit conversions going on. They aren't really relevant to the specific warning, but I included them in this answer to show what the compiler really does with that expression.

  • ch in your example is of type unsigned char.
  • EOF is guaranteed to be of type int (C99 7.19.1).

So the expression is equivalent to

(unsigned char)ch != (int)EOF

The integer promotion rules in C will implicitly convert the unsigned char to int, because int can represent every value of unsigned char on ordinary platforms:

(int)ch != (int)EOF

Both operands now have the same type, so the balancing rules (aka the usual arithmetic conversions) have nothing further to convert.

On your compiler EOF is likely -1:

(int)ch != -1

After promotion, ch can only hold a value from 0 to UCHAR_MAX (0 to 255 with 8-bit chars). It can never be negative, so it can never equal EOF, hence the warning.



Answer 3:

I have encountered this problem too. My solution is to use feof().

unsigned int xxFunc(){
  FILE *fin;
  unsigned char c;
  fin = fopen("...", "rb");
  if(feof(fin) != 0) return EOF;
  c = fgetc(fin);
  fclose(fin);
...
}

And you can define an int variable to compare with EOF. For example:

int flag = xxFunc();
while(flag != EOF) {...}

This works for me.

**IMPORTANT UPDATE**

After using the method I mentioned before, I found a serious problem: feof() is not a good way to break out of the while loop. The reason is explained here: http://www.gidnetwork.com/b-58.html

So I found a better way to do this, using an int variable:

int flag;
unsigned char c;
while((flag = fgetc(fin)) != EOF) 
{ 
  // flag receives the return value; copy it to c only after the EOF check
  c = flag;
  ... 
}

I tested this and it works.
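For completeness, here is a small self-contained sketch of the int-based loop. The helper name count_bytes is an assumption made for illustration, not something from the answer above.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in fp.
   fgetc() returns either an unsigned char value (0..255) or EOF,
   so its result has to be held in an int until it has been tested. */
long count_bytes(FILE *fp)
{
    int c;            /* wide enough for every byte value plus EOF */
    long n = 0;

    while ((c = fgetc(fp)) != EOF) {
        unsigned char byte = (unsigned char)c; /* safe: c is known not to be EOF */
        (void)byte;   /* a real program would process the byte here */
        n++;
    }
    return n;
}
```

Because the EOF test happens while the value is still an int, a 0xFF byte in the file is counted like any other byte and does not end the loop.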



Answer 4:

You need to use an int.

fgetc() returns an int specifically so that it can indicate the end of file.

It runs fine with signed char because EOF (-1) is in the range, but it would stop early if the file contains a byte with value 255 (0xFF), which becomes -1 when stored in a signed char and is then indistinguishable from EOF.

Use an int, and cast it to a char after you've checked for EOF.
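The early stop with signed char can be reproduced with a sketch like the following. It assumes EOF is -1 and that converting 255 to signed char yields -1 (implementation-defined, but the usual result). The function name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: the buggy pattern, with the result narrowed to
   signed char before the EOF test. Assumes EOF == -1 and that converting
   255 to signed char yields -1 (implementation-defined, but usual). */
long count_with_signed_char(FILE *fp)
{
    signed char ch;
    long n = 0;

    while ((ch = (signed char)fgetc(fp)) != EOF)
        n++;          /* stops at real end of file or at the first 0xFF byte */
    return n;
}
```

Under those assumptions, a file containing the bytes 'a', 'b', 0xFF, 'c' is reported as 2 bytes long, since the 0xFF byte is mistaken for end of file.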



Answer 5:

When you compare an unsigned int with a signed int, the signed int is converted to unsigned int before the comparison. Hence when you are reading the file with an unsigned int ch, reading an EOF gives you 2^32-1 (on a machine with 4-byte int), and when comparing it with EOF, EOF is converted to unsigned, which is also 2^32-1, and hence the program stops!

If you use unsigned char ch, fgetc() still returns EOF at end of file, but the assignment converts it to unsigned char, which keeps only the low 8 bits (on a machine with 8-bit char) and gives you 255. In the comparison ch is promoted to int, so you are comparing 255 with -1, causing an infinite loop.

The problem here is truncating before the comparison.
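The unsigned int case can be sketched as follows. The function name is invented for illustration, and the cast on EOF only spells out the conversion the compiler would apply implicitly.

```c
#include <stdio.h>

/* Hypothetical helper: stores an fgetc()-style return value in an
   unsigned int and compares it with EOF converted to unsigned int. */
int equals_eof_after_unsigned_int(int value)
{
    unsigned int ch = (unsigned int)value; /* EOF becomes a huge value, nothing is truncated */
    return ch == (unsigned int)EOF;        /* same conversion on both sides */
}
```

Since unsigned int is as wide as int, no bits are lost, and no real byte value in the range 0 to 255 converts to the same number as EOF. That is why the unsigned int variant terminates correctly while the unsigned char variant does not.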

If you use

while((ch = fgetc(fp))!=(unsigned char)EOF)
    printf("%c",ch);

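your program will terminate! Note, however, that a byte with value 255 (0xFF) in the file will now also end the loop, because it compares equal to (unsigned char)EOF.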



Answer 6:

A lint warning is produced with this kind of implementation:

Comparing type 'char' with EOF

// read the data in a buffer
    ch = getc(csv_file);
    while (ch != EOF)

FIX (ch must also be declared as int):

// read the data in a buffer
    while ((ch = getc(csv_file)) != EOF)