I was asked this question in an interview: the size of `char` is 2 bytes on some operating systems, but 4 bytes or something else on others.
Why is that so?
Why is it different from other fundamental types, such as `int`?
That was probably a trick question. `sizeof(char)` is always 1.

If the size differs, it's probably because of a non-conforming compiler, in which case the question should be about the compiler itself, not about the C or C++ language.
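As a minimal sketch of that guarantee, the checks below should compile on any conforming C++11 (or later) compiler. The constant `kCharSize` is a hypothetical name introduced only for illustration.

```cpp
#include <cstddef>

// sizeof reports sizes in units of char, so these hold by definition
// on every conforming implementation, whatever the hardware looks like.
static_assert(sizeof(char) == 1, "sizeof(char) must be 1");
static_assert(sizeof(signed char) == 1, "sizeof(signed char) must be 1");
static_assert(sizeof(unsigned char) == 1, "sizeof(unsigned char) must be 1");

// An array of N chars therefore has size N.
static_assert(sizeof(char[10]) == 10, "an array of 10 chars has size 10");

// Hypothetical constant, only here so the value can also be inspected at run time.
constexpr std::size_t kCharSize = sizeof(char);
```

If any of these assertions failed, that would point at the compiler rather than at the language.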
5.3.3 Sizeof [expr.sizeof]
The sizes of types other than `char`, `signed char`, and `unsigned char` are implementation-defined, and they vary for various reasons. An `int` has a larger range if it's represented in 64 bits instead of 32, but it's also more efficient as 32 bits on a 32-bit architecture.

The physical sizes (in terms of the number of bits) of types are usually dictated by the target hardware.
For example, some CPUs can access memory only in units of at least 16 bits. For the best performance, `char` can then be defined as a 16-bit integer. If you want 8-bit chars on this CPU, the compiler has to generate extra code for packing and unpacking 8-bit values into and from 16-bit memory cells. That extra packing/unpacking code will make your code bigger and slower.

And that's not the end of it. If you subdivide 16-bit memory cells into 8-bit chars, you effectively introduce an extra bit in addresses/pointers. If normal addresses are 16-bit in the CPU, where do you stick this extra, 17th bit? There are two options:
- make pointers wider than 16 bits, so that there is room for the extra bit
- keep pointers at 16 bits and give up one address bit to select the byte, which halves the number of addressable cells
The latter option can sometimes be practical. For example, if the entire address space is divided in halves, one of which is used by the kernel and the other by user applications, then application pointers will never use one bit in their addresses. You can use that bit to select an 8-bit byte in a 16-bit memory cell.
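The packing and unpacking described above can be sketched roughly as follows. This is a host-side simulation under stated assumptions, not code for any real CPU: memory is modeled as an array of 16-bit cells, a pointer is a 16-bit value whose top bit (assumed to be the one applications never use) selects the low or high 8-bit half, and the names `Cell`, `kSelectorBit`, `load_octet`, and `store_octet` are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// One memory cell of the simulated machine; only the low 16 bits are used.
using Cell = std::uint_least16_t;

// Assumption: bit 15 of a pointer is free and picks the half of the cell.
constexpr unsigned kSelectorBit = 0x8000u;
constexpr unsigned kCellMask = 0x7FFFu;

// Extra work on every 8-bit load: find the cell, then shift and mask.
unsigned load_octet(const Cell* memory, unsigned pointer) {
    const std::size_t cell = pointer & kCellMask;
    const unsigned shift = (pointer & kSelectorBit) ? 8u : 0u;
    return (memory[cell] >> shift) & 0xFFu;
}

// Extra work on every 8-bit store: read the cell, replace one half, write back.
void store_octet(Cell* memory, unsigned pointer, unsigned value) {
    const std::size_t cell = pointer & kCellMask;
    const unsigned shift = (pointer & kSelectorBit) ? 8u : 0u;
    const unsigned mask = 0xFFu << shift;
    const unsigned old = memory[cell];
    const unsigned updated = (old & ~mask) | ((value & 0xFFu) << shift);
    memory[cell] = static_cast<Cell>(updated & 0xFFFFu);
}
```

Each 8-bit access costs a mask and a shift, and each store becomes a read-modify-write of the whole cell, which is the size and speed overhead the answer refers to.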
C was designed to run on as many different CPUs as possible. This is why the physical sizes of `char`, `short`, `int`, `long`, `long long`, `void*`, `void(*)()`, `float`, `double`, `long double`, `wchar_t`, etc. can vary.

Now, when we're talking about different physical sizes in different compilers producing code for the same CPU, this becomes more of an arbitrary choice. However, it may not be as arbitrary as it seems. For example, many compilers for Windows define `int` = `long` = 32 bits. They do that to avoid programmer confusion when using Windows APIs, which expect `INT` = `LONG` = 32 bits. Defining `int` and `long` as something else would invite bugs caused by lapses in the programmer's attention. So, compilers have to follow suit in this case.

And lastly, the C (and C++) standard operates with chars and bytes. They are the same concept size-wise. But C's bytes aren't your typical 8-bit bytes; they can legally be bigger than that, as explained earlier. To avoid confusion you may use the term octet, whose name implies the number 8. A number of protocols use this word for this very purpose.
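To illustrate the byte/octet distinction, here is a minimal sketch of how protocol code might split a 32-bit value into four octets without assuming that a `char` is exactly 8 bits wide. The function names `to_octets` and `from_octets` are hypothetical, and big-endian (most significant octet first) order is an arbitrary choice made for the example.

```cpp
#include <array>
#include <climits>
#include <cstdint>

// A C or C++ byte has CHAR_BIT bits, which is at least 8 but may be more.
static_assert(CHAR_BIT >= 8, "a byte holds at least 8 bits");

// Hypothetical helper: split a value into four octets, most significant first.
// Each element holds exactly 8 significant bits, even if unsigned char is wider.
std::array<unsigned char, 4> to_octets(std::uint_least32_t value) {
    return {{
        static_cast<unsigned char>((value >> 24) & 0xFFu),
        static_cast<unsigned char>((value >> 16) & 0xFFu),
        static_cast<unsigned char>((value >> 8) & 0xFFu),
        static_cast<unsigned char>(value & 0xFFu),
    }};
}

// Hypothetical helper: rebuild the value, ignoring any bits above the low 8.
std::uint_least32_t from_octets(const std::array<unsigned char, 4>& octets) {
    return (static_cast<std::uint_least32_t>(octets[0] & 0xFFu) << 24) |
           (static_cast<std::uint_least32_t>(octets[1] & 0xFFu) << 16) |
           (static_cast<std::uint_least32_t>(octets[2] & 0xFFu) << 8) |
           static_cast<std::uint_least32_t>(octets[3] & 0xFFu);
}
```

Masking with `0xFF` rather than relying on the width of `unsigned char` keeps the code correct on platforms where `CHAR_BIT` is greater than 8.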