Consider the following†:
size_t r = 0;
r--;
const bool result = (r == -1);
Does the comparison whose result initialises result
have well-defined behaviour?
And is its result true
, as I'd expect?
This Q&A was written because I was unsure of two factors in particular.
They may both be identified by use of the term "crucial[ly]" in my answer.
† This example is inspired by an approach for loop conditions when the counter is unsigned:
for (size_t r = m.size() - 1; r != -1; r--)
size_t r = 0;
r--;
const bool result = (r == -1);
Strictly speaking, the value of result
is implementation-defined. In practice, it's almost certain to be true
; I'd be surprised if there were an implementation where it's false
.
The value of r
after r--
is the value of SIZE_MAX
, a macro defined in <stddef.h>
/ <cstddef>
.
For the comparison r == -1
, the usual arithmetic conversions are performed on both operands. The first step in the usual arithmetic conversions is to apply the integral promotions to both operands.
r
is of type size_t
, an implementation-defined unsigned integer type. -1
is an expression of type int
.
On most systems, size_t
is at least as wide as int
. On such systems, the integral promotions cause the value of r
either to be converted to unsigned int
or to keep its existing type (the former can happen if size_t
has the same width as int
, but a lower conversion rank). Now the left operand (which is unsigned) has at least the rank of the right operand (which is signed). The right operand is converted to the type of the left operand. This conversion yields the same value as r
, and so the equality comparison yields true
.
That's the "normal" case.
Suppose we have an implementation where size_t
is 16 bits (let's say it's a typedef
for unsigned short
) and int
is 32 bits. So SIZE_MAX == 65535
and INT_MAX == 2147483647
. Or we could have a 32-bit size_t
and a 64-bit int
. I doubt that any such implementation exists, but nothing in the standard forbids it (see below).
Now the left side of the comparison has type size_t
and value 65535
. Since signed int
can represent all the values of type size_t
, the integral promotions convert the value to 65535
of type int
. Both side of the ==
operator have type int
, so the usual arithmetic conversions have nothing to do. The expression is equivalent to 65535 == -1
, which is clearly false
.
As I mentioned, this kind of thing is unlikely to happen with an expression of type size_t
-- but it can easily happen with narrower unsigned types. For example, if r
is declared as an unsigned short
, or an unsigned char
, or even a plain char
on a system where that type is signed, the value of result
will probably be false
. (I say probably because short
or even unsigned char
can have the same width as int
, in which case result
will be true
.)
In practice, you can avoid the potential problem by doing the conversion explicitly rather than relying on the implementation-defined usual arithmetic conversions:
const bool result = (r == (size_t)-1);
or
const bool result = (r == SIZE_MAX);
C++11 standard references:
- 5.10 [expr.eq] Equality operators
- 5.9 [expr.rel] Relational operators (specifies that the usual arithmetic conversions are performed)
- 5 [expr] Expressions, paragraph 9: Usual arithmetic conversions
- 4.5 [conv.prom] Integral promotions
- 18.2 [support.types]
size_t
18.2 paragraphs 6-7:
6 The type size_t
is an implementation-defined unsigned integer type
that is large enough to contain the size in bytes of any object.
7 [ Note: It is recommended that implementations choose types for
ptrdiff_t
and size_t
whose integer conversion ranks (4.13) are no
greater than that of signed long int
unless a larger size is
necessary to contain all the possible values. — end note ]
So there's no prohibition on making size_t
narrower than int
. I can almost plausibly imagine a system where int
is 64 bits, but no single object can be bigger than 232-1 bytes so size_t
is 32 bits.
Yes, and the result is what you would expect.
Let's break it down.
What is the value of r
at this point? Well, the underflow is well-defined and results in r
taking on its maximum value by the time the comparison is run. std::size_t
has no specific known bounds, but we can make reasonable assumptions about its range when compared to that of an int
:
std::size_t
is the unsigned integer type of the result of the sizeof operator. [..] std::size_t
can store the maximum size of a theoretically possible object of any type (including array).
And, just to get it out of the way, the expression -1
is unary -
applied to the literal 1
, and has type int
on any system:
[C++11: 2.14.2/2]:
The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be represented. [..]
(I won't cite all the text that describes how applying unary -
to an int
results in an int
, but it does.)
It's more than reasonable to suggest that, on the majority of systems, an int
is not going to be able to hold std::numeric_limits<std::size_t>::max()
.
Now, what happens to those operands?
[C++11: 5.10/1]:
The ==
(equal to) and the !=
(not equal to) operators have the same semantic restrictions, conversions, and result type as the relational operators except for their lower precedence and truth-value result. [..]
[C++11: 5.9/2]:
The usual arithmetic conversions are performed on operands of arithmetic or enumeration type. [..]
Let's examine these "usual arithmetic conversions":
[C++11: 5/9]:
Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result.
This pattern is called the usual arithmetic conversions, which are defined as follows:
- If either operand is of scoped enumeration type (7.2), no conversions are performed; if the other
operand does not have the same type, the expression is ill-formed.
- If either operand is of type
long double
, the other shall be converted to long double`.
- Otherwise, if either operand is
double
, the other shall be converted to double
.
- Otherwise, if either operand is
float
, the other shall be converted to float
.
- Otherwise, the integral promotions (4.5) shall be performed on both operands.59 Then the following rules shall be applied to the promoted operands:
- If both operands have the same type, no further conversion is needed.
- Otherwise, if both operands have signed integer types or both have unsigned integer types, the
operand with the type of lesser integer conversion rank shall be converted to the type of the
operand with greater rank.
- Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to
the type of the operand with unsigned integer type.
- Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.
- Otherwise, both operands shall be converted to the unsigned integer type corresponding to the
type of the operand with signed integer type.
I've highlighted the passage that takes effect here and, as for why:
[C++11: 4.13/1]
: Every integer type has an integer conversion rank defined as follows
- [..]
- The rank of
long long int
shall be greater than the rank of long int
, which shall be greater than the rank of int
, which shall be greater than the rank of short int
, which shall be greater than the rank of signed char
.
- The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type.
- [..]
All integral types, even the fixed-width ones, are composed of the standard integral types; therefore, logically, std::size_t
must be unsigned long long
, unsigned long
, or unsigned int
.
If std::size_t
is unsigned long long
, or unsigned long
, then the rank of std::size_t
is greater than the rank of unsigned int
and, therefore, also of int
.
If std::size_t
is unsigned int
, the rank of std::size_t
is equal to the rank of unsigned int
and, therefore, also of int
.
Either way, per the usual arithmetic conversions, the signed operand is converted to the type of the unsigned operand (and, crucially, not the other way around!). Now, what does this conversion entail?
[C++11: 4.7/2]:
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
[C++11: 4.7/3]:
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.
This means that std::size_t(-1)
is equivalent to std::numeric_limits<std::size_t>::max()
; it's crucial that the value n in the above clause relates to the number of bits used to represent the unsigned type, not the source type. Otherwise, we'd be doing std::size_t((unsigned int)-1)
, which is not the same thing at all — it could be many orders of magnitude smaller than our desired value!
Indeed, now that we know the conversions are all well-defined, we can test this value:
std::cout << (std::size_t(-1) == std::numeric_limits<size_t>::max()) << '\n';
// "1"
And, just to illustrate my point from earlier, on my 64-bit system:
std::cout << std::is_same<unsigned long, std::size_t>::value << '\n';
std::cout << std::is_same<unsigned long, unsigned int>::value << '\n';
std::cout << std::hex << std::showbase
<< std::size_t(-1) << ' '
<< std::size_t(static_cast<unsigned int>(-1)) << '\n';
// "1"
// "0"
// "0xffffffffffffffff 0xffffffff"