Why is char's signedness not defined in C?

Posted 2019-01-15 12:36

The C standard states:

ISO/IEC 9899:1999, 6.2.5.15 (p. 49)

The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.

And indeed GCC defines it one way or the other according to the target platform.
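For example, a minimal sketch (assuming a hosted implementation with <limits.h>) to see which choice your compiler made; with GCC the default can also be flipped with -fsigned-char / -funsigned-char:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* CHAR_MIN is 0 when plain char is unsigned, negative when it is signed. */
        printf("plain char is %s on this platform\n",
               CHAR_MIN == 0 ? "unsigned" : "signed");
        return 0;
    }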

My question is: why does the standard do that? I can see nothing good coming out of an ambiguous type definition, except hideous and hard-to-spot bugs.

More than that, in ANSI C (before C99), the only byte-sized type is char, so using char for arithmetic is sometimes inevitable. So saying "one should never use char for math" is not really true. If that were the intent, a saner decision would have been to include three types: "char, ubyte, sbyte".

Is there a reason for this, or is it just some weird backwards-compatibility gotcha, meant to allow bad (but common) compilers to call themselves standard-compliant?

5 Answers
劳资没心,怎么记你
Answer #2 · 2019-01-15 12:43

"Plain" char having unspecified signed-ness allows compilers to select whichever representation is more efficient for the target architecture: on some architectures, zero extending a one-byte value to the size of "int" requires less operations (thus making plain char 'unsigned'), while on others the instruction set makes sign-extending more natural, and plain char gets implemented as signed.

The star\"
Answer #3 · 2019-01-15 12:50

On some machines, a signed char is too small to hold all the characters in the C character set (letters, digits, standard punctuation, etc.) as non-negative values; on such machines, char must be unsigned. On other machines, an unsigned char can hold values larger than a signed int can represent (since char and int are the same size); on those machines, char must be signed.
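A sketch of the constraint behind the first case (C99 6.2.5p3 guarantees that members of the basic execution character set are non-negative when stored in a char):

    #include <assert.h>

    int main(void)
    {
        /* The standard guarantees these values are non-negative.  On an
         * EBCDIC machine, where digits are encoded at 0xF0..0xF9, that
         * is only possible if plain char is unsigned. */
        assert('0' >= 0 && 'A' >= 0 && 'z' >= 0);
        return 0;
    }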

小情绪 Triste *
Answer #4 · 2019-01-15 12:55

I suppose (off the top of my head) that their thinking was along the following lines:

If you care about the sign of char (i.e., you are using it as a byte), you should explicitly choose signed char or unsigned char.
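A sketch of that advice (the sbyte/ubyte typedef names are only illustrative; C99's <stdint.h> provides int8_t/uint8_t for the same purpose):

    #include <stddef.h>

    /* When the sign matters, say so explicitly instead of relying on
     * plain char.  (These typedef names are just for illustration.) */
    typedef signed char   sbyte;
    typedef unsigned char ubyte;

    /* Modulo-256 checksum: the wrap-around is well-defined because the
     * result is converted back to an explicitly unsigned byte type. */
    ubyte checksum(const ubyte *data, size_t n)
    {
        ubyte sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += data[i];
        return sum;
    }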

我只想做你的唯一
Answer #5 · 2019-01-15 13:02

Perhaps historically some implementations' char was signed and others' was unsigned, and so, to remain compatible with both, the standard couldn't define it as one or the other.

Fickle 薄情
Answer #6 · 2019-01-15 13:05

In those good old days when C was defined, the character world was 7-bit (ASCII), so the "sign bit" of an 8-bit char could be used for other things (like EOF).
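This is also why getchar() returns an int rather than a char: EOF (typically -1) has to be distinguishable from every valid character value. A minimal sketch:

    #include <stdio.h>

    int main(void)
    {
        int c;  /* must be int, not char, or EOF may be indistinguishable
                   from a real character */
        while ((c = getchar()) != EOF)
            putchar(c);
        return 0;
    }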
