According to the accepted answer to the question "What is the benefit of terminating if … else if constructs with an else clause?",
there is a corruption case (in an embedded system) that can cause a bool variable (which is 1 bit wide) to be neither True nor False. That means the else
path in this code could actually be executed instead of being dead code.
if (g_str.bool_variable == True) {
    ...
}
else if (g_str.bool_variable == False) {
    ...
}
else {
    //handle error
}
I have tried to find out how, but I still have no clue.
Is it possible? And if so, how?
Edit: To be clearer, here is the declaration of the bool variable:
struct {
    unsigned char bool_variable : 1;
} g_str;
And also define:
#define True 1
#define False 0
The way you've defined things, this wouldn't happen on an x86. But it could happen with some compiler/CPU combination.

Consider hypothetical assembly code for the if-else-else construct in question, and then the hypothetical machine code for some of those instructions. One or more bit corruptions in such machine code would cause the code to execute the second else block.

unsigned char bool_variable : 1 is not a boolean variable. It is a 1-bit integer bit-field. _Bool bool_variable is a boolean variable.

So right away, with unsigned char bool_variable : 1, it is implementation-defined whether that type is even allowed. If such an implementation treated unsigned char bit-fields like int bit-fields (since the range of unsigned char fits in the range of int), then trouble occurs with 1-bit int bit-fields. It is implementation-defined whether a 1-bit int bit-field takes on the values 0, 1 or 0, -1. With 0, -1, the field never compares equal to True (1), and this leads to the //handle error clause of this if() block.

The solution is to simplify the if() test rather than comparing against both True and False.

Bit-fields are a corner of C where the difference between unsigned int and signed int matters for plain int: an int bit-field narrower than the full width of an int may be treated as either signed int or unsigned int. With bit-fields, it is best to be explicit and use _Bool, signed int, or unsigned int. Note: unsigned is synonymous with unsigned int.
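To make the 1-bit case concrete, here is a minimal sketch in C. The struct flags type and the function names are hypothetical, chosen only for illustration. The result of storing 1 into a 1-bit signed field is implementation-defined, so the comments describe what commonly happens rather than what must happen. The last function shows one possible simplified test (a single truthiness check), which is an assumption about what "simplify" could look like.

```c
#define True  1
#define False 0

/* Hypothetical struct: both fields are exactly one bit wide. */
struct flags {
    signed int   s : 1;  /* explicitly signed: typically holds 0 or -1 */
    unsigned int u : 1;  /* explicitly unsigned: holds 0 or 1 */
};

/* Three-way test from the question, applied to the signed field.
 * Returns 1 (True branch), 0 (False branch) or -1 (error branch). */
int three_way_signed(int v)
{
    struct flags f = {0, 0};
    f.s = v;  /* 1 does not fit; the stored value is implementation-defined, commonly -1 */
    if (f.s == True) {
        return 1;
    } else if (f.s == False) {
        return 0;
    } else {
        return -1;  /* reachable when the field holds -1 */
    }
}

/* Same test on the unsigned field: only 0 and 1 can be stored. */
int three_way_unsigned(int v)
{
    struct flags f = {0, 0};
    f.u = v;  /* value is reduced modulo 2 */
    if (f.u == True) {
        return 1;
    } else if (f.u == False) {
        return 0;
    } else {
        return -1;  /* not reachable through a normal store */
    }
}

/* Simplified test: one truthiness check and a plain else. */
int simplified_signed(int v)
{
    struct flags f = {0, 0};
    f.s = v;
    return f.s ? 1 : 0;  /* any nonzero stored value counts as true */
}
```

On implementations where the signed field stores -1, three_way_signed(1) ends up in the error branch, while simplified_signed(1) still reports true.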
The reason I wrote that example like it did, using "mybool", FALSE and TRUE, was to indicate that this is a non-standard/pre-standard boolean type.

Before C got language support for boolean types, you would invent your own boolean type like this:
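A hand-rolled boolean of this kind might look like the following sketch. The exact definition (an unsigned char plus two macros) and the classify_bool helper are assumptions made for illustration, not a quotation of any particular code base.

```c
/* Hypothetical pre-standard boolean: a plain byte plus two macros. */
typedef unsigned char BOOL;
#define FALSE 0
#define TRUE  1

/* Defensive three-way check on a BOOL.
 * Returns 1 for TRUE, 0 for FALSE, -1 for any other bit pattern. */
int classify_bool(BOOL b)
{
    if (b == TRUE) {
        return 1;
    } else if (b == FALSE) {
        return 0;
    } else {
        return -1;  /* reachable: a byte can also hold 2..255 */
    }
}
```

Because BOOL occupies a whole byte here, a stray write or a flipped bit can leave it holding a value that is neither TRUE nor FALSE.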
or possibly:
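An enum-based variant is another common shape for such a type. Again, this is only a sketch under that assumption, and bool_is_valid is a hypothetical helper name.

```c
/* Hypothetical enum-based boolean: FALSE is 0, TRUE is 1. */
typedef enum { FALSE, TRUE } BOOL;

/* Returns 1 if b holds one of the two named values, 0 otherwise.
 * The underlying integer type is wider than one bit, so other
 * values can be stored in it. */
int bool_is_valid(BOOL b)
{
    return (b == FALSE || b == TRUE) ? 1 : 0;
}
```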
In either situation you get a BOOL type which is larger than 1 bit, and can therefore be 0, 1 or something else.

Had I written the same example using the stdbool bool/_Bool, false and true, it wouldn't have made any sense, because then the compiler might implement the code as a bit-field, and a single bit can only have the values 1 or 0.

In retrospect, a better example of the use of defensive programming might have been something like this:
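One possible illustration of the idea, not necessarily the example the answer had in mind, is a multi-valued state checked with a terminating else. The state_t type, its enumerators and handle_state are all hypothetical names.

```c
/* Hypothetical state type with three legal values. */
typedef enum { STATE_IDLE, STATE_RUN, STATE_STOP } state_t;

/* Dispatches on the state.
 * Returns 0 if the state was recognized, -1 if it was not. */
int handle_state(state_t s)
{
    if (s == STATE_IDLE) {
        /* ... idle handling ... */
        return 0;
    } else if (s == STATE_RUN) {
        /* ... run handling ... */
        return 0;
    } else if (s == STATE_STOP) {
        /* ... stop handling ... */
        return 0;
    } else {
        /* Defensive branch: reached only if s was corrupted or
         * a new state was added without updating this function. */
        return -1;
    }
}
```

Here the final else is not dead code in principle, because the variable's storage can represent many more values than the three that are named.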
This code may have a race condition. The magnitude of the problem will depend on exactly what the compiler emits when it compiles this code.
Here's what might be happening. Your code first checks bool_variable == True, which evaluates false. Execution skips the first block and jumps to the else if. Your code then checks bool_variable == False, which also evaluates false, so you fall into the final else. You are doing two discrete tests on bool_variable. Something else (such as another thread or an ISR) may be altering the value of bool_variable during the brief window of time after the first test has run and before the second test.

You can avoid the problem completely by using if (bool == True) {} else {} instead of re-testing for false. That version would only check the value once, eliminating the window where corruption can happen. The separate False check doesn't really buy you anything in the first place since, by definition, a one-bit-wide field can only take on two possible values, so !True must be the same as False. Even if you were using a larger boolean type that could technically take on more than two discrete values, you should be using it as if it could only have two (such as 0=false, everything else=True).

This hints at a much larger problem, though. Even with only one variable check instead of two, you have one thread reading the variable and another altering it at practically the same time. The corruption occurring immediately before the True check would possibly still give you erroneous results but be even harder to detect. You need some sort of locking mechanism (mutex, spinlock, etc.) to ensure that only one thread is accessing that field at a time.

The only way to prove any of this for certain, though, is to step through it with a debugger or hardware probe and watch the value change between the two tests. If that's not an option, you may be able to de-couple the blocks by changing the else if to if and storing the value of bool_variable before each of the two tests. Any time the two differ, then something external has corrupted your value.
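The two ideas above (test the value once, or store it before each test and compare) can be sketched as follows. This is a minimal illustration, assuming the field lives in a volatile global that an ISR or another thread may write; struct shared, read_once and detect_change are hypothetical names, and unsigned int is used for the bit-field to stay within the portable types. Note that volatile only forces the reads to happen; it is not a substitute for the locking mentioned above.

```c
#define True  1
#define False 0

/* Hypothetical shared state, possibly written by an ISR or another thread. */
struct shared {
    unsigned int bool_variable : 1;
};
volatile struct shared g_str;

/* Single-read version: take one snapshot and branch on the local copy.
 * Returns 1 for True, 0 for False. */
int read_once(void)
{
    unsigned int snapshot = g_str.bool_variable;  /* the only access */
    if (snapshot == True) {
        return 1;
    } else {
        return 0;
    }
}

/* Diagnostic version: store the value before each of the two tests.
 * Returns 1 or 0 if both reads agree, -1 if the value changed in between. */
int detect_change(void)
{
    unsigned int first = g_str.bool_variable;   /* read before the True test */
    int was_true = (first == True);

    unsigned int second = g_str.bool_variable;  /* read before the False test */
    int was_false = (second == False);

    if (first != second) {
        return -1;  /* something external modified the field */
    }
    (void)was_false;  /* kept only to mirror the two separate tests */
    return was_true ? 1 : 0;
}
```

In a single-threaded run detect_change never returns -1; the error path only shows up when something else writes the field between the two reads.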