Why is this an undefined behavior?

2019-03-22 18:49发布

My answer to this question was this function:

inline bool divisible15(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
    return x * 4008636143 <= 286331153;
}

It perfectly worked on my machine with VS2008 compiler, however here it doesn't work at all.

Does anyone has an idea, why it I get different results on different compilers? unsigned overflow isn't undefined behavior.

Important note: after some test it was confirmed it is faster than taking the remainder of the division by 15. (However not on all compilers)

2条回答
Explosion°爆炸
2楼-- · 2019-03-22 19:04

It's not Undefined Behavior, it's just a breaking change in the C language standard between C89 and C99.

In C89, integer constants like 4008636143 that don't fit in an int or long int but do fit in an unsigned int are unsigned, but in C99, they are either long int or long long int (depending on whichever one is the smallest one that can hold the value). As a result, the expressions all get evaluated using 64 bits, which results in the incorrect answer.

Visual Studio is a C89 compiler and so results in the C89 behavior, but that Ideone link is compiling in C99 mode.

This becomes more evident if you compile with GCC using -Wall:

test.c: In function ‘divisible15’:
test.c:8:3: warning: this decimal constant is unsigned only in ISO C90

From C89 §3.1.3.2:

The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int; unsuffixed octal or hexadecimal: int, unsigned int, long int, unsigned long int; [...]

C99 §6.4.4.1/5-6:

5) The type of an integer constant is the first of the corresponding list in which its value can be represented.

Suffix | Decimal Constant | Octal or Hexadecimal Constant
-------+------------------+------------------------------
none   | int              | int
       | long int         | unsigned int
       | long long int    | long int
       |                  | unsigned long int
       |                  | long long int
       |                  | unsigned long long int
-------+------------------+------------------------------
[...]

6) If an integer constant cannot be represented by any type in its list, it may have an extended integer type, if the extended integer type can represent its value. If all of the types in the list for the constant are signed, the extended integer type shall be signed. [...]

For completeness, C++03 actually does have Undefined Behavior for when the integer constant is too big to fit in a long int. From C++03 §2.13.1/2:

The type of an integer literal depends on its form, value, and suffix. If it is decimal and has no suffix, it has the first of these types in which its value can be represented: int, long int; if the value cannot be represented as a long int, the behavior is undefined. If it is octal or hexadecimal and has no suffix, it has the first of these types in which its value can be represented: int, unsigned int, long int, unsigned long int. [...]

The C++11 behavior is identical to C99, see C++11 §2.14.2/3.

To ensure that the code behaves consistently when compiled as either C89, C99, C++03, and C++11, the simple fix is to make the constant 4008636143 unsigned by suffixing it with u as 4008636143u.

查看更多
来,给爷笑一个
3楼-- · 2019-03-22 19:05

Since you are using int constants, the compiler can "use" the overflow is undefined to shortcut the code. If you modify with U as below, it "works".

inline bool divisible15(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
    return x * 4008636143u <= 286331153u;
}

testing with:

#include <iostream>


inline bool divisible15a(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
//    return x * 4008636143 <= 286331153;
    return x * 4008636143u <= 286331153;
}

inline bool divisible15b(unsigned int x) 
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
//    return x * 4008636143 <= 286331153;
    return x * 4008636143 <= 286331153;
}


int main()
{
    for(unsigned int i = 0; i < 100; i++)
    {
    if (divisible15a(i))
    {
        std::cout << "a:" << i << std::endl;
    }
    if (divisible15b(i))
    {
        std::cout << "b:" << i << std::endl;
    }
    }
}

Output:

a:0
b:0
a:15
a:30
a:45
a:60
a:75
a:90

Code:

_Z12divisible15aj:
.LFB1192:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %eax
    imull   $-286331153, %eax, %eax
    cmpl    $286331153, %eax
    setbe   %al
    popq    %rbp
    ret

_Z12divisible15bj:
.LFB1193:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %edx
    movl    $4008636143, %eax
    imulq   %rdx, %rax
    cmpq    $286331153, %rax
    setle   %al
    popq    %rbp
    ret

So, yes, I agree with Carl/Adam's analysis that it doesn't fit in a 32-bit int, so therefore promoted to long or long long.

查看更多
登录 后发表回答