I am confused about CHAR_BIT in limits.h. I have read articles saying the macro CHAR_BIT is there for portability, and that using the macro rather than a magic number like 8 in code is reasonable. But limits.h comes from glibc-headers, and its value there is fixed at 8. If glibc-headers is installed on a system on which a byte has more than 8 bits (say 16 bits), is compilation then wrong? Is a char assigned 8 bits or 16 bits?
And when I modified CHAR_BIT to 9 in limits.h, the following code still printed '8'. How?
#include <stdio.h>
#include <limits.h>
int
main(int argc, char **argv)
{
printf("%d\n", CHAR_BIT);
return 0;
}
The following is supplementary:
I've read all the replies so far, but it's still not clear to me. In practice, I can follow the advice to #include <limits.h> and use CHAR_BIT. But that's a separate matter. Here I want to know why it appears the way it does. First, there is a fixed value '8' in glibc's /usr/include/limits.h; what happens on systems where 1 byte != 8 bits when glibc is installed? Then I found that '8' is not even the value the code actually uses, so does the '8' mean nothing there? Why put '8' there if the value is not used at all?
Thanks,
Diving into system header files can be a daunting and unpleasant experience. glibc's header files can easily create a lot of confusion, because under certain circumstances they include other system headers that override what has been defined so far.
In the case of limits.h, if you read the header file carefully, you will find that glibc's definition of CHAR_BIT is only used when you compile code without gcc: the #define sits inside an #if condition a few lines above it.
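The relevant lines of glibc's /usr/include/limits.h look roughly like this (a reconstruction from glibc sources; comments and exact spacing vary between versions):

#if !defined __GNUC__ || __GNUC__ < 2

/* We don't have #include_next.
   Define ANSI <limits.h> for standard 32-bit words.  */

...

/* Number of bits in a `char'.  */
#  define CHAR_BIT      8

...

#endif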
Thus, if you compile your code with gcc, which is most likely the case, this definition of CHAR_BIT is not used. That's why you changed it and your code still printed the old value. Scrolling down a little in the same header, you can find what happens in the case where you are using GCC:
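Quoting the gcc branch of glibc's limits.h (again, exact comments may vary between glibc versions):

/* Get the compiler's limits.h, which defines almost all the ISO constants.  */
#if defined __GNUC__ && !defined _GCC_LIMITS_H_
/* `_GCC_LIMITS_H_' is what GCC's file defines.  */
# include_next <limits.h>
#endif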
include_next is a GCC extension. You can read about what it does in this question: Why would one use #include_next in a project? Short answer: it searches for the next header file with the name you specify (limits.h in this case) along the remaining include path, and thereby includes GCC's generated limits.h. On my system, that happens to be /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h. Consider the following program:
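A minimal test file will do; here is a sketch consistent with the description below (the key point is that #include <limits.h> sits on line 2):

#include <stdio.h>
#include <limits.h>

int
main(void)
{
    printf("%d\n", CHAR_BIT);
    return 0;
}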
With this program, you can find the path for your system with the help of gcc -E, which outputs a special linemarker for each file it includes (see http://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html). Because #include <limits.h> is on line 2 of this program, which I named test.c, running gcc -E test.c lets me find the real file that is being included.
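On my machine, the relevant linemarkers in the output look something like this (an illustration; exact paths and flags depend on your distribution and GCC version):

# 2 "test.c"
# 1 "/usr/include/limits.h" 1 3 4
...
# 1 "/usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h" 1 3 4

And in that last file you can find this (quoted from GCC's generated limits.h; the comment may differ):

/* Number of bits in a `char'.  */
#undef CHAR_BIT
#define CHAR_BIT __CHAR_BIT__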
Note the #undef directive: it is needed to override any possible previous definition. It is saying: "forget whatever CHAR_BIT was, this is the real thing". __CHAR_BIT__ is a gcc predefined macro, which GCC's online documentation describes as "defined to the number of bits used in the representation of the char data type". You can read its value with a simple program:
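For example, a file (call it code.c, the name used below) containing nothing but the macro itself, which the preprocessor will expand to its value:

__CHAR_BIT__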
Then run gcc -E code.c: the preprocessor expands __CHAR_BIT__ to its value, so the number of bits in a char appears directly in the output. Note that you shouldn't use __CHAR_BIT__ directly, as gcc's manpage mentions; use CHAR_BIT from <limits.h> instead. Obviously, if you change the CHAR_BIT definition inside /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h, or whatever the equivalent path is on your system, you will be able to see the change in your code. Consider again the simple test.c from above that prints CHAR_BIT.
Changing the CHAR_BIT definition in gcc's limits.h (that is, the file at /usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h) from __CHAR_BIT__ to 9 will make that code print 9. Again, you can stop the compilation process after preprocessing takes place; you can check the result with gcc -E.

What if you're compiling code with a compiler other than gcc?
Well, in that case glibc's default ANSI limits are assumed, "for standard 32-bit words" as the header's own comment puts it. Paragraph 5.2.4.2.1 of the ANSI C standard (Sizes of integral types <limits.h>) sets the minimum: the "number of bits for smallest object that is not a bit-field (byte)", CHAR_BIT, shall be at least 8. And POSIX mandates that a compliant platform have CHAR_BIT == 8.
Of course, glibc's assumptions can go wrong on machines that do not have CHAR_BIT == 8, but note that you would have to be on an unusual architecture AND not be using gcc AND be on a platform that is not POSIX compliant. Not very likely.

Remember, however, that "implementation defined" means that the compiler writer chooses what happens. Thus, even if you're not compiling with gcc, there is a chance that your compiler has some sort of __CHAR_BIT__ equivalent defined. Even though glibc will not use it, you could do a little research and use your compiler's definition directly. This is generally bad practice, though: you would be writing code geared towards one specific compiler.

Keep in mind that you should never mess with system header files. Very weird things can happen when you compile code against wrong values of important constants like CHAR_BIT. Do this for educational purposes only, and always restore the original file afterwards.

CHAR_BIT should never be changed for a given system. The value of CHAR_BIT specifies the size in bits of the smallest addressable unit of storage (a "byte") -- so even a system that uses 16-bit characters (UCS-2 or UTF-16) will most likely have CHAR_BIT == 8.
Almost all modern systems have CHAR_BIT == 8; C implementations for some DSPs set it to 16 or 32.

The value of CHAR_BIT doesn't control the number of bits in a byte; it documents it, and lets user code refer to it. For example, the number of bits in an object is sizeof object * CHAR_BIT.
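A short illustration of that idiom (my example, not from the original answer):

#include <stdio.h>
#include <limits.h>

int
main(void)
{
    double d;

    /* sizeof yields a size in bytes; multiplying by CHAR_BIT gives bits. */
    printf("char:   %d bits\n", CHAR_BIT);
    printf("int:    %zu bits\n", sizeof (int) * CHAR_BIT);
    printf("double: %zu bits\n", sizeof d * CHAR_BIT);
    return 0;
}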
If you edit your system's <limits.h> file, that doesn't change the actual characteristics of the system; it just gives you an inconsistent system. It's like hacking your compiler so it defines the symbol _WIN32 rather than __linux__; that doesn't magically change your system from Linux to Windows, it just breaks it.

CHAR_BIT is a read-only constant for each system. It's defined by the developers of the system. You don't get to change it; don't even try.

As far as I know, glibc only works on systems with 8-bit bytes. It's theoretically possible to modify it so it works on other systems, but without a lot of development work you probably wouldn't even be able to install it on a system with 16-bit bytes.
As for why hacking the limits.h file didn't change the value you got for CHAR_BIT: system headers are complicated, and not intended to be edited in place. When I compile a small file that contains just #include <limits.h>, it directly or indirectly includes a whole chain of headers, listed below.
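On my system the chain looks something like this (an illustrative reconstruction; the exact set of files and paths will vary with your distribution and GCC version):

/usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/limits.h
/usr/lib/gcc/i486-linux-gnu/4.7/include-fixed/syslimits.h
/usr/include/limits.h
/usr/include/features.h
/usr/include/bits/posix1_lim.h
/usr/include/bits/local_lim.h
/usr/include/linux/limits.h
/usr/include/bits/posix2_lim.h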
Two of these files have #define directives for CHAR_BIT, one setting it to 8 and the other to __CHAR_BIT__. I don't know (and I don't need to care) which of those definitions actually takes effect. All I need to know is that #include <limits.h> will give me the correct definition of CHAR_BIT -- as long as I don't do anything that corrupts the system.

The whole point is that when you compile for a system whose bytes have a different size, CHAR_BIT is defined to the correct value for that system.