In C++, if I read an integer from a string, it seems it does not really matter whether I use u or d as the conversion specifier, as both accept even negative integers.
#include <cstdio>
using namespace std;
int main() {
int u, d;
sscanf("-2", "%u", &u);
sscanf("-2", "%d", &d);
puts(u == d ? "u == d" : "u != d");
printf("u: %u %d\n", u, u);
printf("d: %u %d\n", d, d);
return 0;
}
I dug deeper to find if there is any difference. I found that
int u, d;
sscanf("-2", "%u", &u);
sscanf("-2", "%d", &d);
is equivalent to
int u, d;
u = strtoul("-2", NULL, 10);
d = strtol("-2", NULL, 10);
according to cppreference.com.
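To make the strtoul/strtol comparison concrete, here is a minimal sketch. The helper names parse_with_strtoul and parse_with_strtol are hypothetical and exist only for illustration. The point is that strtoul accepts a leading minus sign and negates the converted value in its return type, so the result wraps around instead of being rejected.

```cpp
#include <climits>
#include <cstdlib>

// Hypothetical helper: strtoul accepts a leading minus sign. It converts the
// digits and then negates the value in the return type (unsigned long),
// so "-2" wraps around to ULONG_MAX - 1.
unsigned long parse_with_strtoul(const char* text) {
    return std::strtoul(text, nullptr, 10);
}

// Hypothetical helper: strtol returns the signed value directly.
long parse_with_strtol(const char* text) {
    return std::strtol(text, nullptr, 10);
}
```

Calling parse_with_strtoul("-2") yields ULONG_MAX - 1, while parse_with_strtol("-2") yields -2. Assigning the former to an int, as in the snippet above, narrows a value that does not fit.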
Is there any difference at all between u and d when using these conversion specifiers for parsing, i.e. in the format passed to scanf-type functions? What is it?
The answer is the same for C and C++, right? If not, I am interested in both.
%d: Scan an integer as a decimal signed int. A similar conversion specifier, %i, interprets the number as hexadecimal when preceded by 0x and as octal when preceded by 0. Otherwise, it is identical.
%u: Scan an integer as a decimal unsigned int.
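The difference between %d and %i can be seen with a short sketch. The function names scan_decimal and scan_any_base are hypothetical, chosen only for this illustration.

```cpp
#include <cstdio>

// Hypothetical helper: %d always reads base 10. For "0x10" it consumes
// only the leading "0" and stops at the 'x'.
int scan_decimal(const char* text) {
    int value = 0;
    std::sscanf(text, "%d", &value);
    return value;
}

// Hypothetical helper: %i detects the base from the prefix:
// "0x" means hexadecimal, a leading "0" means octal, otherwise decimal.
int scan_any_base(const char* text) {
    int value = 0;
    std::sscanf(text, "%i", &value);
    return value;
}
```

With these helpers, "0x10" gives 0 with %d and 16 with %i, and "010" gives 10 with %d and 8 with %i. For input without a prefix, such as "-2", both return the same value.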
Technically, you are invoking undefined behavior when you read a negative number into an int using the %u format specifier. You make sscanf treat a pointer to a signed integer as a pointer to an unsigned integer, and those types are not compatible. It only appears to work because signed and unsigned int have the same size and, with two's complement representation, the wrapped unsigned value has the same bit pattern as the negative int.
TL;DR: You are not guaranteed to get -2 from sscanf("-2", "%u", &u);
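The bit-pattern coincidence can be illustrated without relying on undefined behavior. The sketch below assumes a two's complement platform (the language itself guarantees this only since C++20), and the helper name bits_as_int is hypothetical.

```cpp
#include <cstring>

// Hypothetical helper: copy the bits of an unsigned int into an int.
// memcpy avoids the incompatible-pointer problem described above.
int bits_as_int(unsigned int bits) {
    static_assert(sizeof(int) == sizeof(unsigned int),
                  "int and unsigned int are expected to have the same size");
    int result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

On such a platform, bits_as_int(static_cast<unsigned int>(-2)) returns -2, which is why the original program seemed to behave. That outcome is a property of the representation, not something sscanf promises.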
Each conversion specifier has a corresponding type of the result argument defined in the C spec. The %u and %d conversion directives really accept the same inputs, as you observed, but the argument corresponding to %u shall be of type unsigned int*, not int*. I.e. your example should be corrected as:
unsigned int u;
int d;
sscanf("-2", "%u", &u);
sscanf("-2", "%d", &d);
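A slightly fuller sketch of the corrected version also checks the return value of sscanf, which reports how many conversions succeeded. The ScanResult struct and the scan_both function are hypothetical names used only for this illustration.

```cpp
#include <cstdio>

// Hypothetical result type for this sketch.
struct ScanResult {
    int conversions;          // how many of the two sscanf calls converted a value
    unsigned int as_unsigned; // target of %u: must be unsigned int
    int as_signed;            // target of %d: must be int
};

// Parse the same text once with %u and once with %d, each into an
// object of the type its conversion specifier requires.
ScanResult scan_both(const char* text) {
    ScanResult result{0, 0u, 0};
    if (std::sscanf(text, "%u", &result.as_unsigned) == 1) ++result.conversions;
    if (std::sscanf(text, "%d", &result.as_signed) == 1) ++result.conversions;
    return result;
}
```

For "-2", both conversions succeed and as_signed holds -2. Common implementations store the wrapped value in as_unsigned, but the sketch deliberately does not depend on which value that is.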
Had you enabled warnings, you’d get one when compiling the original example. And rightfully so:
Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.
Emphasis mine.
So, you were invoking undefined behavior (see "What is Undefined Behavior"). Once you invoke undefined behavior, you're on your own and nasty things may happen.
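One way to stay clear of the "cannot be represented in the object" case is to parse with strtol and check the range before storing the value. This is only a sketch of that idea, not something the standard prescribes, and parse_int_checked is a hypothetical name.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: returns true and stores the value only if `text`
// (after optional leading whitespace) is a complete decimal integer
// that fits in an int.
bool parse_int_checked(const char* text, int& out) {
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0') return false;         // no digits, or trailing characters
    if (errno == ERANGE) return false;                     // does not fit in long
    if (value < INT_MIN || value > INT_MAX) return false;  // does not fit in int
    out = static_cast<int>(value);
    return true;
}
```

With this helper, "-2" is accepted and stored as -2, while input such as "12abc" or a number too large for long is rejected instead of leading to undefined behavior.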
Conversion specifiers are defined in C99 (latest public draft, N1256; official PDF). The definition is the same as in C11 (latest public draft, N1570; official PDF). The most recent C++ draft (as of 2015-02-10, N4567), linked from the list of C++ standard documents under another question on Stack Overflow, takes the definition of the cstdio header from C99 and does not modify it (apart from placing the functions into the std namespace and the minor modifications mentioned in § 27.9.2).