Suppose I have some legacy code which cannot be changed unless a bug is discovered, and it contains this code:
bool data[32];
memset(data, 0, sizeof(data));
Is this a safe way to set all bool in the array to a false value?
More generally, is it safe to memset a bool to 0 in order to make its value false?
Is it guaranteed to work on all compilers? Or do I need to request a fix?
I believe this is unspecified, although it seems likely that the underlying representation of false would be all zeros. Boost.Container relies on this as well (emphasis mine):
Boost.Container uses std::memset with a zero value to initialize some
types as in most platforms this initialization yields to the desired
value initialization with improved performance.
Following the C11 standard, Boost.Container assumes that for any
integer type, the object representation where all the bits are zero
shall be a representation of the value zero in that type. Since
_Bool/wchar_t/char16_t/char32_t are also integer types in C, it considers all C++ integral types as initializable via std::memset.
The C11 quote they point to as a rationale actually comes from a C99 defect report, defect 263: all-zero bits representations, which added the following:
For any integer type, the object representation where all the bits are
zero shall be a representation of the value zero in that type.
So the question here is whether that assumption is correct: are the underlying object representations for integer types compatible between C and C++?
The proposal Resolving the difference between C and C++ with regards to object representation of integers sought to answer this to some extent, but as far as I can tell it was never resolved. I cannot find conclusive evidence either way in the draft standard. There are a couple of places where it links to the C standard explicitly with respect to types. Section 3.9.1 [basic.fundamental] says:
[...] The signed and unsigned integer types shall satisfy the
constraints given in the C standard, section 5.2.4.2.1.
and 3.9 [basic.types], which says:
The object representation of an object of type T is the sequence of N
unsigned char objects taken up by the object of type T, where N equals
sizeof(T). The value representation of an object is the set of bits
that hold the value of type T. For trivially copyable types, the value
representation is a set of bits in the object representation that
determines a value, which is one discrete element of an
implementation-defined set of values.44
where footnote 44 (which is not normative) says:
The intent is that the memory model of C++ is compatible with that of
ISO/IEC 9899 Programming Language C.
The closest the draft standard gets to specifying the underlying representation of bool is in section 3.9.1:
Types bool, char, char16_t, char32_t, wchar_t, and the signed and
unsigned integer types are collectively called integral types.50 A
synonym for integral type is integer type. The representations of
integral types shall define values by use of a pure binary numeration
system.51 [ Example: this International Standard permits 2’s
complement, 1’s complement and signed magnitude representations for
integral types. —end example ]
the section also says:
Values of type bool are either true or false.
but all we know of true and false is:
The Boolean literals are the keywords false and true. Such literals
are prvalues and have type bool.
and we know they are convertible to 0 and 1:
A prvalue of type bool can be converted to a prvalue of type int, with
false becoming zero and true becoming one.
but this gets us no closer to the underlying representation.
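The value-level guarantees quoted above can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the names as_int and check_round_trip are made up for illustration. It only exercises conversions and never looks at the bytes of a bool, which is why it tells us nothing about what memset does:

```cpp
#include <cassert>

// Hypothetical helper: relies only on the standard bool-to-int conversion.
constexpr int as_int(bool b) { return b; }

// Conversions guaranteed by the standard; none of them inspects object bytes.
static_assert(as_int(false) == 0, "false converts to zero");
static_assert(as_int(true) == 1, "true converts to one");
static_assert(static_cast<bool>(0) == false, "zero converts to false");
static_assert(static_cast<bool>(42) == true, "non-zero converts to true");

// Runtime variant of the same check, for a value not known at compile time.
inline void check_round_trip(int n) {
    const bool b = (n != 0);
    assert(as_int(b) == (n != 0 ? 1 : 0));
}
```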
As far as I can tell, the only place where the standard referenced the actual underlying bit value, besides padding bits, was removed via defect report 1796: Is all-bits-zero for null characters a meaningful requirement?:
It is not clear that a portable program can examine the bits of the representation; instead, it would appear to be limited to examining the bits of the numbers corresponding to the value representation (3.9.1 [basic.fundamental] paragraph 1). It might be more appropriate to require that the null character value compare equal to 0 or '\0' rather than specifying the bit pattern of the representation.
There are more defect reports that deal with the gaps in the standard with respect to what a bit is and the difference between the value and object representations.
Practically, I would expect this to work, but I would not consider it safe, since we cannot nail this down in the standard. Whether you need to change it is not clear; there is a non-trivial trade-off involved. Assuming it works now, the question is whether it is likely to break with future versions of various compilers, and that is unknown.
Is it guaranteed by the law? No.
C++ says nothing about the representation of bool values.
Is it guaranteed by practical reality? Yes.
I mean, if you wish to find a C++ implementation that does not represent boolean false as a sequence of zeroes, I shall wish you luck. Given that false must implicitly convert to 0, and true must implicitly convert to 1, and 0 must implicitly convert to false, and non-0 must implicitly convert to true … well, you'd be silly to implement it any other way.
Whether that means it's "safe" is for you to decide.
I don't usually say this, but if I were in your situation I would be happy to let this slide. If you're really concerned, you can add a test executable to your distributable to validate the precondition on each target platform before installing the real project.
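Such a precondition check could look roughly like the sketch below. It is not part of the original code: all function names are hypothetical, and it assumes that comparing the bytes of a bool holding false against zero bytes is the property worth validating. It inspects the object representation with memcmp rather than reading a memset bool back, so the check itself does not depend on the behaviour being tested:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true if `false` is stored as all-zero bytes here.
bool false_is_all_zero_bytes() {
    const bool f = false;
    const unsigned char zeros[sizeof(bool)] = {};
    return std::memcmp(&f, zeros, sizeof(bool)) == 0;
}

// Hypothetical exit code for a pre-install test executable: 0 means OK.
int memset_precondition_exit_code() {
    return false_is_all_zero_bytes() ? 0 : 1;
}

// Debug-build guard that could be called once at startup instead.
void assert_memset_precondition() {
    assert(false_is_all_zero_bytes() && "false is not all-zero bytes");
}
```

The main of the test executable could simply return memset_precondition_exit_code(), letting the installer refuse to proceed on a non-zero result.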
No. It is not safe (or more specifically, portable). However, it likely works by virtue of the fact that your typical implementation will:
- use 0 to represent a boolean (actually, the C++ specification requires it)
- generate an array of elements that memset() can deal with.
However, best practice would dictate using bool data[32] = {false}. Additionally, this will likely free the compiler to represent the structure differently internally, since using memset() could result in it generating a 32-byte array of values rather than, say, a single 4-byte value that will fit nicely within your average CPU register.
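As a rough illustration of the memset-free alternatives, here is a minimal sketch assuming C++11; the helper names are invented for this example. An empty initializer list has the same effect as {false}, because every element without an explicit initializer is value-initialized to false, and std::fill covers the case where the array has to be reset later:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: true if no element of the array is true.
template <std::size_t N>
bool all_false(const bool (&a)[N]) {
    return std::none_of(std::begin(a), std::end(a), [](bool b) { return b; });
}

// Initialization at the point of declaration, no memset involved.
bool value_initialized_is_all_false() {
    bool data[32] = {};  // every element is value-initialized to false
    return all_false(data);
}

// Resetting an array that already holds other values.
bool refilled_is_all_false() {
    bool data[32];
    std::fill(std::begin(data), std::end(data), true);
    std::fill(std::begin(data), std::end(data), false);  // portable reset
    return all_false(data);
}

// Runs both variants; the asserts are active in debug builds only.
void check_alternatives() {
    assert(value_initialized_is_all_false());
    assert(refilled_is_all_false());
}
```

Both forms work on values rather than bytes, so they do not depend on how the implementation represents false.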
From 3.9.1/7:
Types bool , char , char16_t , char32_t , wchar_t , and the signed and
unsigned integer types are collectively called integral types. A
synonym for integral type is integer type . The representations of
integral types shall define values by use of a pure binary numeration
system.
Given this, I can't see any possible implementation of bool that wouldn't represent false as all 0 bits.