Say I have an expression like this
short v = ( ( p[ i++ ] & 0xFF ) << 4 ) | ( p[ i ] & 0xF0000000 ) >> 28;
with p being a pointer to a dynamically allocated array of 32-bit integers.
When exactly will i be incremented? I noticed that the above code delivers a different value for v than the following code:
short v = ( p[ i++ ] & 0xFF) << 4;
v |= ( p[ i ] & 0xF0000000 ) >> 28;
My best guess for this behaviour is that i is not incremented before the right side of the above | is evaluated.
Any insight would be appreciated!
Thanks in advance,
\Bjoern
Sometime before the end of the expression.
It is undefined behavior to read an object that is also modified in the same expression for any purpose other than determining the new value, just as it is undefined to write to an object twice. You may even get an inconsistent value (i.e. read something that is neither the old nor the new value).
The first example is undefined behavior. You cannot separately read a variable in an expression that also changes the value of that variable, unless the read is used to compute the new value. See this (among other places on the Internet).
Your expression has undefined behavior; see for example this about sequence points in C and C++ statements.
The problem is order of evaluation:
The C++ standard does not define the order of evaluation of most sub-expressions. This leaves the compiler free to be as aggressive as possible when optimizing.
Let's break it down:
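As a rough sketch, the statement can be decomposed into elementary steps like the following. The numbering (1) to (9), the temporaries a1, a2, t3 to t7, and the function name evaluate_in_written_order are assumptions chosen to line up with the discussion; the "after" comments mark the only ordering constraints.

```cpp
#include <cstdint>

// One possible ordering of the sub-expressions of
//   v = (p[i++] & 0xFF) << 4 | (p[i] & 0xF0000000) >> 28
// Each step only has to respect the "after" constraints in its comment.
short evaluate_in_written_order(const std::uint32_t* p, int& i) {
    std::uint32_t a1 = p[i];              // (1)
    i = i + 1;                            // (2) after (1)
    std::uint32_t a2 = p[i];              // (3) no constraint relative to (2)
    std::uint32_t t3 = a1 & 0xFFu;        // (4) after (1)
    std::uint32_t t4 = a2 & 0xF0000000u;  // (5) after (3)
    std::uint32_t t5 = t3 << 4;           // (6) after (4)
    std::uint32_t t6 = t4 >> 28;          // (7) after (5)
    std::uint32_t t7 = t5 | t6;           // (8) after (6) and (7)
    return static_cast<short>(t7);        // (9) after (8)
}
```

Because this version is written as separate statements, its own behavior is well defined; it only illustrates one of the orders a compiler could pick for the original single expression.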
Now the compiler is free to rearrange these sub-expressions as long as the above 'after' clauses are not violated. So one quick, easy optimization is to move (3) up one slot and then apply common subexpression elimination: (1) and (3), now next to each other, are the same, so (3) can be eliminated.
The compiler does not have to perform this optimization (and it is probably better at it than I am, with other tricks up its sleeve). But you can see how the value of (a1) will always be what you expect, while the value of (a2) depends on the order in which the compiler decides to evaluate the other sub-expressions.
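To make the dependence on ordering concrete, here is a small sketch that performs the second read explicitly either after or before the increment. Both functions are hypothetical stand-ins for what a compiler might produce, and the names are assumptions made for illustration.

```cpp
#include <cstdint>

// Second read happens after the increment: a2 comes from the next element.
short second_read_after_increment(const std::uint32_t* p, int i) {
    std::uint32_t a1 = p[i];
    i = i + 1;
    std::uint32_t a2 = p[i];
    return static_cast<short>(((a1 & 0xFFu) << 4) | ((a2 & 0xF0000000u) >> 28));
}

// Second read hoisted above the increment: a2 is the same element as a1.
short second_read_before_increment(const std::uint32_t* p, int i) {
    std::uint32_t a1 = p[i];
    std::uint32_t a2 = p[i];
    i = i + 1;  // i is a local copy; the increment no longer affects either read
    return static_cast<short>(((a1 & 0xFFu) << 4) | ((a2 & 0xF0000000u) >> 28));
}
```

For the same input, the low part derived from a1 is identical in both functions, while the bits derived from a2 differ whenever the two array elements have different top nibbles.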
The only guarantee you have is that the compiler cannot move sub-expressions past a sequence point. The most common sequence point is ';' (the end of the statement). There are others, but I would avoid relying on them, as most people don't know the rules that well. If you write code that depends on sequence point tricks, somebody may later refactor it to make it more readable, and your trick has just turned into undefined behavior.
Here everything is well defined, as the write to i is used in place and i is not re-read in the same expression.
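A hypothetical statement of this kind, where i is modified once and nothing else in the expression looks at it, might be sketched as follows (next_low_byte is a made-up name):

```cpp
#include <cstdint>

// i is read and incremented exactly once; no other part of the
// expression touches i, so the result does not depend on evaluation order.
std::uint32_t next_low_byte(const std::uint32_t* p, int& i) {
    return p[i++] & 0xFFu;
}
```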
Simple rule: don't use the ++ or -- operators inside a larger expression. Your code looks just as readable like this:
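One way to write it, assuming p points to unsigned 32-bit values and that the intent is to take the low byte from the current element and the top nibble from the next one (the helper name combine is hypothetical):

```cpp
#include <cstdint>

// The increment is its own statement, so every read of i is
// unambiguously before or after it.
short combine(const std::uint32_t* p, int& i) {
    const std::uint32_t low = p[i];
    ++i;
    const std::uint32_t high = p[i];
    return static_cast<short>(((low & 0xFFu) << 4) | ((high & 0xF0000000u) >> 28));
}
```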
See this article for detailed explanation of evaluation order:
What are all the common undefined behaviours that a C++ programmer should know about?
i is incremented sometime before the next sequence point. The only sequence point in the expression you have given is at the end of the statement - so "sometime before the end of the statement" is the answer in this case.
That's why you shouldn't both modify an lvalue and read its value without an intervening sequence point: the behavior is undefined.
The &&, ||, comma and ?: operators introduce sequence points, as do the end of a full expression and a function call (the latter means that if you do f(i++, &i), the body of f() will see the updated value if it uses the pointer to examine i).
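The function call case can be sketched like this. The function f and the wrapper call_with_increment are illustrative names; the point is only that the side effect of i++ is complete before the body of f runs.

```cpp
#include <utility>

// Returns {value received as the first argument, value of i seen through the pointer}.
std::pair<int, int> f(int by_value, const int* pi) {
    return {by_value, *pi};
}

std::pair<int, int> call_with_increment(int start) {
    int i = start;
    // &i only takes the address and does not read the value of i,
    // so this call does not mix a modification with an unsequenced read.
    return f(i++, &i);
}
```

With start equal to 5, the first member of the result is the old value 5 and the second is the updated value 6.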