I have a pretty basic question, but I am not sure if I understand the concept or not. Suppose we have:
int a = 1000000;
int b = 1000000;
long long c = a * b;
When I run this, c shows a negative value, so I also changed a and b to long long, and then everything was fine. So why do I have to change a and b, when their values are in range of int and their product is assigned to c (which is long long)?
I am using C/C++
The ints are not promoted to long long before multiplication; they remain ints, and so does the product. Then the product is cast to long long, but too late, overflow has struck.
Having one of a or b long long should work as well, as the other would be promoted.
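As a minimal sketch of that fix (the helper name multiply_wide is made up for illustration), widening just one operand before the multiplication is enough, because the other operand is then converted to long long as well:

```cpp
// Hypothetical helper, not from the question: cast one operand to long long
// so the multiplication itself is carried out in long long.
long long multiply_wide(int a, int b) {
    // b is converted to long long by the usual arithmetic conversions,
    // so the product is computed in long long and cannot overflow int.
    return static_cast<long long>(a) * b;
}
```

With a = b = 1000000 this returns 1000000000000, which fits comfortably in long long.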
For arithmetic operators, the type of the result does not depend on what you assign the result to, only on the types of the operands. The usual arithmetic conversions are performed on the operands to bring them to a common type. As part of this, operands of types smaller than int are first promoted to int (or to unsigned int if int cannot represent all of their values). In this case both operands are already int, so no conversion is required and the multiplication is done in int. See Why must a short be converted to an int before arithmetic operations in C and C++? for the details on why.
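One way to see this for yourself is to ask the compiler for the type of the product. The sketch below is only an illustration (the helper product_has_type is a made-up name); it checks the result type at compile time without evaluating any multiplication:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: true when multiplying an L by an R yields Expected.
// decltype is an unevaluated context, so nothing is actually multiplied.
template <typename Expected, typename L, typename R>
constexpr bool product_has_type() {
    return std::is_same<decltype(std::declval<L>() * std::declval<R>()),
                        Expected>::value;
}

// int * int stays int, no matter what the result is later assigned to.
static_assert(product_has_type<int, int, int>(), "int * int is int");
// short operands are promoted to int before the multiplication.
static_assert(product_has_type<int, short, short>(), "short * short is int");
// One long long operand is enough to make the whole product long long.
static_assert(product_has_type<long long, long long, int>(),
              "long long * int is long long");
```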
What we have now is undefined behavior, since signed integer overflow is undefined. This is covered in the draft C++ standard, section 5 [Expr], which says:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined. [ Note: most existing implementations of C++
ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all
floating point exceptions vary among machines, and is usually adjustable by a library function. —end note ]
Nowadays we have sanitizers to catch these types of undefined behavior: building with -fsanitize=undefined with either clang or gcc will catch this at runtime with the following error (see it live):
runtime error: signed integer overflow: 1000000 * 1000000 cannot be represented in type 'int'
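If you would rather detect the overflow in code than rely on the sanitizer, gcc and clang also provide the __builtin_mul_overflow builtin. The sketch below assumes one of those two compilers (it is a compiler extension, not standard C++), and the wrapper name checked_multiply is made up for illustration:

```cpp
// Hypothetical wrapper around the GCC/Clang builtin __builtin_mul_overflow.
// Returns true and stores the product in out when a * b fits in int;
// returns false when the mathematical product does not fit.
bool checked_multiply(int a, int b, int& out) {
    // The builtin returns true when the result overflowed, so invert it.
    return !__builtin_mul_overflow(a, b, &out);
}
```

With a 32-bit int, checked_multiply(1000000, 1000000, out) reports failure instead of silently invoking undefined behavior.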
For reference, section 5.6 [expr.mul] says:
[...]The usual arithmetic conversions are performed on the operands
and determine the type of the result.
and section 5 says:
Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following
rules shall be applied to the promoted operands
- If both operands have the same type, no further conversion is needed.
It's kind of absurd, because on x86 the one-operand imul instruction always computes the full 64-bit product of two 32-bit ints
(int * int -> 64 bits),
so if you look at the machine code, you see:
imul
which stores the 64-bit result in edx:eax,
then
cdq
which copies the sign bit of eax into edx (thus discarding the upper half of the 64-bit result),
and then edx:eax is stored into the 64-bit variable.
And if you convert the 32-bit values to 64 bits before the multiplication, you get a call to the 64-bit multiplication function for no reason.
(I checked: this is not the case when the code is optimized.)