I know that right-shifting a negative value of a signed type is implementation-defined, but what if I perform a left shift? For example:
int i = -1;
i << 1;
Is this well-defined?
I think the standard doesn't say anything about a negative value of a signed type:
if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
It only clarifies that if the result isn't representable in the signed type, then the behavior is undefined.
When the C standards were codified, different platforms would do different things when left-shifting negative integers. On some of them, the behavior might trigger implementation-specific traps whose behavior could be outside a program's control, and which could include random code execution. Nonetheless, it was possible that programs written for such platforms might make use of such behavior (a program could e.g. specify that a user would have to do something to configure a system's trap handlers before running it, but the program could then exploit the behavior of the suitably-configured trap handlers).
The authors of the C standard did not want to say that compilers for machines where left-shifting of negative numbers would trap must be modified to prevent such trapping (since programs might potentially be relying upon it), but if left-shifting a negative number is allowed to trigger a trap which could cause any arbitrary behavior (including random code execution) that means that left-shifting a negative number is allowed to do anything whatsoever. Hence Undefined Behavior.
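To make the rule concrete, here is a minimal sketch of a check for whether `e1 << e2` has defined behavior when both operands are `int`, following the wording quoted in the question plus the standard's separate requirement that the shift count be non-negative and less than the width of the promoted left operand. The name `shl_is_defined` is hypothetical, and the width computation assumes `int` has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the standard or any library): reports
   whether `e1 << e2` has defined behavior for operands of type int.
   Assumes int has no padding bits, so its width is sizeof(int) * CHAR_BIT. */
static bool shl_is_defined(int e1, int e2)
{
    const int width = (int)(sizeof(int) * CHAR_BIT);

    if (e2 < 0 || e2 >= width)   /* shift count out of range */
        return false;
    if (e1 < 0)                  /* negative left operand */
        return false;
    /* e1 * 2^e2 fits in int exactly when e1 <= INT_MAX / 2^e2. */
    return e1 <= (INT_MAX >> e2);
}
```

With this check, `shl_is_defined(-1, 1)` is false, which matches the `i << 1` example from the question.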
In practice, until about 5 years ago, 99+% of compilers written for a machine that used two's-complement math (meaning 99+% of machines made since 1990) would consistently yield the following behaviors for `x<<y` and `x>>y`, to the extent that code reliance upon such behavior was considered no more non-portable than code which assumed `char` was 8 bits. The C standard didn't mandate such behavior, but any compiler author wanting to be compatible with a wide base of existing code would follow it.

- If `y` is a signed type, `x << y` and `x >> y` are evaluated as though `y` was cast to unsigned.
- If `x` is type `int`, `x<<y` is equivalent to `(int)((unsigned)x << y)`.
- If `x` is type `int` and positive, `x>>y` is equivalent to `(unsigned)x >> y`. If `x` is of type `int` and negative, `x>>y` is equivalent to `~(~((unsigned)x) >> y)`.
- If `x` is of type `long`, similar rules apply, but with `unsigned long` rather than `unsigned`.
- If `x` is an N-bit type and `y` is greater than N-1, then `x >> y` and `x << y` may arbitrarily yield zero, or may act as though the right-hand operand was `y % N`; they may require extra time proportional to `y` [note that on a 32-bit machine, if `y` is negative, that could potentially be a long time, though I only know of one machine which would in practice run more than 256 extra steps]. Compilers were not necessarily consistent in their choice, but would always return one of the indicated values with no other side-effects.

Unfortunately for some reason I can't quite fathom, compiler writers have decided that rather than allowing programmers to indicate what assumptions compilers should use for dead-code removal, compilers should assume that it is impossible to execute any shift whose behavior isn't mandated by the C standard. Thus, given code like the following:
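(A stand-in sketch of the kind of function being described, assuming a 32-bit value. The function name `shiftleft` and the parameters `v` and `n` are assumptions chosen to match the description that follows, not necessarily the original snippet.)

```c
#include <stdint.h>

/* Hypothetical example: tries to make oversized shift counts yield zero,
   but still performs the shift afterwards. */
uint32_t shiftleft(uint32_t v, uint8_t n)
{
    if (n >= 32)
        v = 0;       /* intended guard for shift counts of 32 or more */
    return v << n;   /* undefined when n >= 32 on a 32-bit operand,
                        regardless of the value of v */
}
```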
a compiler may determine that because the code would engage in Undefined Behavior when n is 32 or larger, the compiler may assume that the `if` will never return true, and may thus omit the code. Consequently, unless or until someone comes up with a standard for C which restores the classic behaviors and allows programmers to designate what assumptions merit dead code removal, such constructs cannot be recommended for any code that might be fed to a hyper-modern compiler.

You're not reading that sentence correctly. The standard defines it if: the left operand has a signed type and a non-negative value and the result is representable (and previously in the same paragraph defines it for unsigned types). In all other cases (notice the use of the semicolon in that sentence), i.e., if any of these conditions isn't verified, the behaviour is undefined.
This includes the