I'm under the same impression as this answer, that size_t is always guaranteed by the standard to be large enough to hold the largest possible type of a given system.
However, this code fails to compile on gcc/Mingw:
#include <stdint.h>
#include <stddef.h>
typedef uint8_t array_t [SIZE_MAX];
error: size of array 'array_t' is too large
Am I misunderstanding something in the standard here? Is size_t allowed to be too large for a given implementation? Or is this another bug in Mingw?
EDIT: further research shows that
typedef uint8_t array_t [SIZE_MAX/2]; // does compile
typedef uint8_t array_t [SIZE_MAX/2+1]; // does not compile
Which happens to be the same as
#include <limits.h>
typedef uint8_t array_t [LLONG_MAX]; // does compile
typedef uint8_t array_t [LLONG_MAX+(size_t)1]; // does not compile
So I'm now inclined to believe that this is a bug in Mingw, because setting the maximum allowed size based on a signed integer type doesn't make any sense.
The limit SIZE_MAX / 2 comes from the definitions of size_t and ptrdiff_t on your implementation, which chooses to give ptrdiff_t and size_t the same width.
The C Standard mandates [1] that type size_t is unsigned and type ptrdiff_t is signed.
The result of subtracting two pointers always [2] has the type ptrdiff_t. This means that, on your implementation, the size of an object must be limited to PTRDIFF_MAX; otherwise a valid difference of two pointers could not be represented in type ptrdiff_t, leading to undefined behavior.
Thus the value SIZE_MAX / 2 equals the value PTRDIFF_MAX. If the implementation chose to have the maximum object size be SIZE_MAX, then the width of the type ptrdiff_t would have to be increased. But it is much easier to limit the maximum size of an object to SIZE_MAX / 2 than it is to give the type ptrdiff_t a positive range greater than or equal to that of type size_t.
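To see how these limits relate on a particular implementation, a minimal sketch like the following can help. The function names ptrdiff_max_is_half_size_max and print_limits are made up for illustration, and the equality it tests is an implementation choice, not something the standard requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if PTRDIFF_MAX equals SIZE_MAX / 2 here.
   That is what you would expect when ptrdiff_t and size_t have the same
   width, but the standard does not require it. */
int ptrdiff_max_is_half_size_max(void)
{
    return (uintmax_t)PTRDIFF_MAX == (uintmax_t)SIZE_MAX / 2;
}

/* Hypothetical helper: prints the three values for comparison. */
void print_limits(void)
{
    printf("SIZE_MAX     = %ju\n", (uintmax_t)SIZE_MAX);
    printf("SIZE_MAX / 2 = %ju\n", (uintmax_t)(SIZE_MAX / 2));
    printf("PTRDIFF_MAX  = %jd\n", (intmax_t)PTRDIFF_MAX);
}
```

Calling print_limits() from main on the implementation in question should show the last two lines agreeing, if the reasoning above applies to it.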
The Standard offers these [3] comments [4] on the topic.
(Quoted from ISO/IEC 9899:201x)
1 (7.19 Common definitions 2)
The types are
ptrdiff_t
which is the signed integer type of the result of subtracting two pointers;
size_t
which is the unsigned integer type of the result of the sizeof operator;
2 (6.5.6 Additive operators 9)
When two pointers are subtracted, both shall point to elements of the same array object,
or one past the last element of the array object; the result is the difference of the
subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header.
If the result is not representable in an object of that type, the behavior is undefined.
3 (K.3.4 Integer types 3)
Extremely large object sizes are frequently a sign that an object’s size was calculated
incorrectly. For example, negative numbers appear as very large positive numbers when
converted to an unsigned type like size_t. Also, some implementations do not support
objects as large as the maximum value that can be represented by type size_t.
4 (K.3.4 Integer types 4)
For those reasons, it is sometimes beneficial to restrict the range of object sizes to detect
programming errors. For implementations targeting machines with large address spaces,
it is recommended that RSIZE_MAX be defined as the smaller of the size of the largest
object supported or (SIZE_MAX >> 1), even if this limit is smaller than the size of
some legitimate, but very large, objects. Implementations targeting machines with small
address spaces may wish to define RSIZE_MAX as SIZE_MAX, which means that there is no object size that is considered a runtime-constraint violation.
The range of size_t is guaranteed to be sufficient to store the size of the largest object supported by the implementation. The reverse is not true: you are not guaranteed to be able to create an object whose size fills the entire range of size_t.
Under such circumstances the question is: what does SIZE_MAX stand for? The largest supported object size? Or the largest value representable in size_t? The answer is: it is the latter, i.e. SIZE_MAX is (size_t) -1. You are not guaranteed to be able to create objects SIZE_MAX bytes large.
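As a small sketch of that point, the identity can be checked at compile time. The function name size_max_is_all_ones is invented for this example; static_assert from <assert.h> needs C11 or later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converting -1 to an unsigned type yields that type's maximum value,
   so this describes the range of size_t, not the largest object size. */
static_assert(SIZE_MAX == (size_t)-1, "SIZE_MAX is the largest size_t value");

/* Hypothetical run-time version of the same check. */
int size_max_is_all_ones(void)
{
    return SIZE_MAX == (size_t)-1;
}
```

Note that this compiles even on an implementation that rejects an array of SIZE_MAX bytes, which is the distinction being made here.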
The reason behind that is that, in addition to size_t, implementations must also provide ptrdiff_t, which is intended (but not guaranteed) to store the difference between two pointers pointing into the same array object. Since type ptrdiff_t is signed, the implementations are faced with the following choices:
1. Allow array objects of size SIZE_MAX and make ptrdiff_t wider than size_t. It has to be wider by at least one bit. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX or smaller.
2. Allow array objects of size SIZE_MAX and use ptrdiff_t of the same width as size_t. Accept the fact that pointer subtraction can overflow and cause undefined behavior, if the pointers are farther than SIZE_MAX / 2 elements apart. The language specification does not prohibit this approach.
3. Use ptrdiff_t of the same width as size_t and restrict the maximum array object size by SIZE_MAX / 2. Such ptrdiff_t can accommodate any difference between two pointers pointing into an array of size SIZE_MAX / 2 or smaller.
You are simply dealing with an implementation that decided to follow the third approach.
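If you want your own code to respect the same limit, for example before computing an allocation size, something along these lines might do. This is only a sketch: array_size_fits_ptrdiff is a hypothetical helper, not something the standard or the compiler provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if an array of `count` elements of
   `elem_size` bytes each occupies at most PTRDIFF_MAX bytes, else 0. */
int array_size_fits_ptrdiff(size_t count, size_t elem_size)
{
    if (elem_size == 0)
        return 1; /* a zero-sized request cannot exceed the limit */

    /* Divide instead of multiplying so the check itself cannot wrap. */
    return (uintmax_t)count <= (uintmax_t)PTRDIFF_MAX / elem_size;
}
```

With same-width size_t and ptrdiff_t, array_size_fits_ptrdiff(SIZE_MAX, 1) is rejected while array_size_fits_ptrdiff(SIZE_MAX / 2, 1) is accepted, which mirrors the compile-time behaviour seen in the question.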
It looks very much like implementation-specific behaviour.
I'm running Mac OS here, and with gcc 6.3.0 the biggest size I can compile your definition with is SIZE_MAX/2; with SIZE_MAX/2 + 1 it does not compile anymore.
On the other hand, with clang 4.0.0 the biggest one is SIZE_MAX/8, and SIZE_MAX/8 + 1 breaks.
Just reasoning from scratch, size_t is a type that can hold the size of any object. The size of any object is limited by the width of the address bus (ignoring multiplexing and systems that can handle e.g. both 32-bit and 64-bit code; call that "code width"). Analogous to INT_MAX, which is the largest value of int, SIZE_MAX is the largest value of size_t. Thus, an object of size SIZE_MAX is all addressable memory. It's reasonable that an implementation flags that as an error; however, I agree that it is an error only in a case where an actual object is allocated, be it on the stack or in global memory. (A call to malloc for that amount will fail anyway.)