It's gotten a lot of attention lately that signed integer overflow is officially undefined in C and C++. However, a given implementation may choose to define it; in C++, an implementation may set std::numeric_limits<signed T>::is_modulo to true to indicate that signed integer overflow is well-defined for that type, and wraps like unsigned integers do.
Visual C++ sets std::numeric_limits<signed int>::is_modulo to true. This has hardly been a reliable indicator, since GCC set this to true for years while treating signed overflow as undefined. I had never encountered a case in which Visual C++'s optimizer did anything but give wraparound behavior to signed integers, until earlier this week.
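For reference, here is a minimal sketch of how that trait can be queried. The helper names claims_modulo and unsigned_max_plus_one are made up for illustration. The result for int only reports what the implementation advertises, which, as noted, may not match what its optimizer assumes.

```cpp
#include <cassert>
#include <limits>

// True if this implementation advertises modulo arithmetic for T.
// (Hypothetical helper; it just forwards the standard trait.)
template <typename T>
constexpr bool claims_modulo() {
    return std::numeric_limits<T>::is_modulo;
}

// Unsigned arithmetic is modular by definition; this is the behavior
// that is_modulo == true would promise for a signed type.
inline unsigned int unsigned_max_plus_one() {
    unsigned int x = std::numeric_limits<unsigned int>::max();
    x += 1u;            // well-defined: wraps around to 0
    assert(x == 0u);
    return x;
}
```

Evaluating claims_modulo<int>() on a given compiler shows what it claims for int; the value is implementation-specific.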
I found a case in which the optimizer emitted x86-64 assembly code that misbehaved if exactly INT_MAX was passed to a particular function. I can't tell whether it's a bug, because Visual C++ doesn't seem to state whether signed integer overflow is considered defined. So I'm wondering, is it supposed to be defined in Visual C++?
EDIT: I found this when reading about a nasty bug in Visual C++ 2013 Update 2 that wasn't in Update 1, where the following loop generates bad machine code if optimizations are enabled:
void func (int *b, int n)
{
    for (int i = 0; i < n; i++)
        b[i * (n + 1)] = 1;
}
That Update 2 bug results in the loop body being compiled as if it were b[i] = 1; which is clearly wrong. The whole loop turned into rep stosd.
What was really interesting was that there was weirdness in the previous version, Update 1. It generated code that didn't properly handle the case where n exactly equaled INT_MAX. Specifically, if n were INT_MAX, the multiplication would act as if n were long long instead of int. In other words, the addition n + 1 would not cause the result to become INT_MIN as it should.
This was the assembly code in Update 1:
    movsxd rax, edx          ; RDX = 0x000000007FFFFFFF; RAX = 0x000000007FFFFFFF.
    test edx, edx
    jle short locret_76      ; Branch not taken, because EDX is nonnegative.
    lea rdx, ds:4[rax*4]     ; RDX = RAX * 4 + 4; RDX becomes 0x0000000200000000.
    nop                      ; But it's wrong. RDX should now be 0xFFFFFFFE00000000.
loc_68:
    mov dword ptr [rcx], 1
    add rcx, rdx
    dec rax
    jnz short loc_68
locret_76:
    retn
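To make the two values in those comments concrete, here is a small sketch that computes the byte stride both ways without itself overflowing a signed int. The function names are hypothetical, and it assumes a 32-bit int with sizeof(int) == 4, as on x86-64 Visual C++.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Byte stride if n + 1 wraps in 32 bits before being scaled by 4.
inline std::int64_t stride_if_wrapping(int n) {
    // Add in unsigned arithmetic, where wraparound is well-defined.
    std::uint32_t sum = static_cast<std::uint32_t>(n) + 1u;
    // Reinterpret as signed 32-bit. This conversion is modular from C++20 on,
    // and implementation-defined (modular on mainstream compilers) before that.
    std::int32_t wrapped = static_cast<std::int32_t>(sum);
    return static_cast<std::int64_t>(wrapped) * 4;
}

// Byte stride as the lea above computes it: widen to 64 bits first,
// then add 1 and scale by 4.
inline std::int64_t stride_if_widened(int n) {
    assert(n > 0);  // mirrors the test/jle guard: the lea is only reached for n > 0
    return (static_cast<std::int64_t>(n) + 1) * 4;
}
```

For n == INT_MAX, stride_if_widened gives 0x0000000200000000, while stride_if_wrapping gives the 64-bit pattern 0xFFFFFFFE00000000. For any smaller positive n the two agree, so INT_MAX is the only input that tells the two interpretations apart.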
The issue is that I don't know whether this is a compiler bug - in GCC and Clang, this wouldn't be a compiler bug, because those compilers consider signed integer overflow/underflow to be undefined. Whether this is a bug in Visual C++ depends on whether Visual C++ considers signed integer overflow/underflow to be undefined.
Every other case I've seen besides this one has shown Visual C++ to consider signed overflow/underflow to be defined, hence the mystery.
Your example probably does have undefined behavior for n == INT_MAX, but not just because of signed integer overflow being undefined (which it may not be on the Microsoft compiler). Rather, you are probably invoking undefined out-of-bounds pointer arithmetic.

Found an interesting tidbit from back in 2016 (VS2015 Update 3):
They talk about the new SSA optimizer they want to introduce into VS2015:
So there you have it. I read that as: "we never programmed in any extra bits to make use of this UB", but starting from VS2015/Update3 we will have some.
I should note that even before that I'd be extremely wary, because for 64-bit code and 32-bit variables, if the compiler/optimizer simply puts the 32-bit signed int into a 64-bit register, you'll have undefined behavior no matter what. (As shown in "How not to code: Undefined behavior is closer than you think"; unfortunately, it's unclear from the blog post whether he used VS2015 pre or post Update 3.)
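If the goal is simply not to depend on how the compiler treats the overflow, one option is to do the index arithmetic in a 64-bit type from the start. This is only a sketch; func_wide is a made-up name, not something from the question or from Microsoft.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical rewrite of func: same stores, but the index is computed
// in 64-bit arithmetic, so n + 1 and i * (n + 1) cannot overflow.
void func_wide(int *b, int n)
{
    assert(b != nullptr || n <= 0);  // nothing is written when n <= 0
    const std::int64_t stride = static_cast<std::int64_t>(n) + 1;
    for (std::int64_t i = 0; i < n; i++)
        b[i * stride] = 1;           // largest index is n * n - 1
}
```

This removes the signed overflow, but not the other problem: for n == INT_MAX the caller would still need a buffer of roughly n * n ints for the accesses to be in bounds.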
So my take on this whole affair is that MSVC always considered it UB, even though past optimizer versions did not take special advantage of the fact. The new SSA optimizer seems to do so for sure. (It would be interesting to test whether the -d2UndefIntOverflow- switch does its job.)