For example,

```cpp
#include <iostream>

int main() {
    unsigned n{};
    std::cin >> n;
    std::cout << n << ' ' << (bool)std::cin << std::endl;
}
```
When the input is -1, clang 6.0.0 outputs 0 0 while gcc 7.2.0 outputs 4294967295 1. I'm wondering which is correct. Or maybe both are correct because the standard does not specify this? By "fail", I mean that (bool)std::cin evaluates to false. clang 6.0.0 fails on the input -0 too.
As of Clang 9.0.0 and GCC 9.2.0, both compilers (using either libstdc++ or libc++ in the case of Clang) agree on the result of the program above, independent of the C++ version (>= C++11) used, and print

4294967295 1

i.e. they store the most positive value representable in unsigned (UINT_MAX, 4294967295 here) and do not set the failbit on the stream.
I think that both are wrong in C++17¹ and that the expected output should be:

4294967295 0

While the stored value is correct for the latest versions of both compilers, I think that ios_base::failbit should be set; I also think there is some confusion in the standard about the notion of the "field to be converted", which may account for the current behaviors.
The standard says, in [facet.num.get.virtuals#3.3]:

The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:

- For a signed integer value, the function strtoll.
- For an unsigned integer value, the function strtoull.
- For a floating-point value, the function strtold.
So we fall back to std::strtoull, which must return² ULLONG_MAX and not set errno in this case (which is what both compilers do).
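A minimal sketch of that step in isolation, feeding the field "-1" straight to std::strtoull (base 10 here is an assumption matching stage 2 with default stream flags):

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <iostream>

int main() {
    const char* field = "-1";  // the field accumulated in stage 2
    char* end = nullptr;
    errno = 0;
    unsigned long long v = std::strtoull(field, &end, 10);

    // "-1" has the expected form: the whole field is consumed, the value 1
    // is negated in the return type, and no range error is reported.
    std::cout << (v == ULLONG_MAX) << ' '  // 1
              << (*end == '\0') << ' '     // 1 (entire field consumed)
              << errno << '\n';            // 0
}
```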
But in the same block (emphasis is mine):

The numeric value to be stored can be one of:

- zero, if the conversion function does not convert the entire field.
- the most positive (or negative) representable value, if the *field to be converted* to a signed integer type represents a value too large positive (or negative) to be represented in val.
- the most positive representable value, if the *field to be converted* to an unsigned integer type represents a value that cannot be represented in val.
- the converted value, otherwise.

The resultant numeric value is stored in val. If the conversion function does not convert the entire field, or if the *field* represents a value outside the range of representable values, ios_base::failbit is assigned to err.
Notice that all of this talks about the "field to be converted", not the actual value returned by std::strtoull. The field here is the widened sequence of characters '-', '1'.

Since the field represents a value (-1) that cannot be represented by an unsigned, the stored value should be UINT_MAX and the failbit should be set on std::cin.
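A self-contained version of that expectation (using a std::istringstream instead of std::cin so it is easy to reproduce, and assuming a 32-bit unsigned):

```cpp
#include <climits>
#include <iostream>
#include <sstream>

int main() {
    std::istringstream in{"-1"};
    unsigned n{};
    in >> n;

    // Per the reading above this should print "4294967295 0" (value clamped,
    // failbit set); recent libstdc++ and libc++ instead print "4294967295 1".
    std::cout << n << ' ' << static_cast<bool>(in) << '\n';
    std::cout << (n == UINT_MAX) << ' ' << in.fail() << '\n';
}
```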
¹ clang was actually right prior to C++17, because the third bullet in the above quote used to read:

- the most negative representable value or zero for an unsigned integer type, if the field represents a value too large negative to be represented in val. ios_base::failbit is assigned to err.
² std::strtoull returns ULLONG_MAX because (thanks @NathanOliver), per the C standard, 7.22.1.4/5:

If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. [...] If the subject sequence begins with a minus sign, the value resulting from the conversion is negated (in the return type).
The question is about differences between the library implementations libc++ and libstdc++, and not so much about differences between the compilers (clang, gcc).

cppreference clears these inconsistencies up pretty well:

The result of converting a negative number string into an unsigned integer was specified to produce zero until C++17, although some implementations followed the protocol of std::strtoull, which negates in the target type, giving ULLONG_MAX for "-1", and so produce the largest value of the target type instead. As of C++17, strictly following std::strtoull is the correct behavior.
This summarises to:

- The largest value of the target type (here UINT_MAX, i.e. 4294967295) is correct going forward, since C++17 (both compilers now get this right)
- Previously it should have been 0 with a strict reading of the standard (libc++)
- Some implementations (notably libstdc++) followed the std::strtoull protocol instead (which is now considered the correct behavior)
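For contrast, the uncontested part of the rule, clamping to the most positive representable value and setting the failbit, is easy to observe with an out-of-range positive input (a sketch assuming a 32-bit unsigned):

```cpp
#include <iostream>
#include <sstream>

int main() {
    std::istringstream in{"4294967296"};  // UINT_MAX + 1, too large for unsigned
    unsigned n{};
    in >> n;

    // Both libstdc++ and libc++ should store UINT_MAX and set the failbit,
    // since the field unambiguously represents a value outside the range of n.
    std::cout << n << ' ' << static_cast<bool>(in) << '\n';  // 4294967295 0
}
```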
Whether the failbit should be set, and why it was set, might be a more interesting question (at least from the language-lawyer perspective). In libc++ (clang) version 7 it now does the same as libstdc++, which seems to suggest that this was chosen as the behavior going forward (even though it goes against the letter of the standard, which says the value should be zero before C++17), but so far I've been unable to find a changelog entry or documentation for this change.
The interesting block of text reads (assuming pre-C++17):

If the conversion function results in a negative value too large to fit in the type of v, the most negative representable value is stored in v, or zero for unsigned integer types.
According to this, the value is specified to be 0. Additionally, nowhere is it indicated that this should result in setting the failbit.
The intended semantics of your std::cin >> n command are described here (as, apparently, std::num_get::get() is called for this operation). There have been some changes to the semantics of this function, specifically with respect to whether 0 is stored or not, in C++11 and then again in C++17.
I'm not entirely sure, but I believe these differences may account for the different behavior you're seeing.
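If you want to poke at that layer directly, here is a rough sketch (not the exact call the extraction operator makes, but close in spirit) that invokes the num_get facet by hand and inspects the error state it reports:

```cpp
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>

int main() {
    std::istringstream in{"-1"};
    std::ios_base::iostate err = std::ios_base::goodbit;
    unsigned n = 0;

    // operator>> for unsigned delegates (as-if) to this facet overload.
    std::use_facet<std::num_get<char>>(in.getloc())
        .get(std::istreambuf_iterator<char>(in),
             std::istreambuf_iterator<char>(),
             in, err, n);

    std::cout << n << ' '
              << ((err & std::ios_base::failbit) != 0) << '\n';
}
```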