Consider the fundamental signed integer types of C++, namely `signed char`, `short int`, `int`, `long int` and `long long int`. What does the current C++ standard require about their underlying bit representation?
Do the constraints on their bit representation specify that they should include:
- optional padding bits
- mandatory value bits
- a mandatory sign bit that is `0` for positive values and `1` for negative values
- if it exists, the sign bit should be the most significant bit
Is this true? If not, then what are the constraints? I am searching for quotes from the standard that prove or disprove this.
EDIT: I am asking this question because, in C, the standard says:
6.2.6.2.2:
For signed integer types, the bits of the object representation shall
be divided into three groups: value bits, padding bits, and the sign
bit. There need not be any padding bits; signed char shall not have
any padding bits. There shall be exactly one sign bit. Each bit that
is a value bit shall have the same value as the same bit in the object
representation of the corresponding unsigned type (if there are M
value bits in the signed type and N in the unsigned type, then M ≤ N).
If the sign bit is zero, it shall not affect the resulting value.
If the sign bit is one, the value shall be modified in one of the
following ways:
- the corresponding value with sign bit 0 is negated (sign and magnitude);
- the sign bit has the value −(2^M) (two’s complement);
- the sign bit has the value −(2^M − 1) (ones’ complement).
Which of these applies is implementation-defined, as is
whether the value with sign bit 1 and all value bits zero (for the
first two), or with sign bit and all value bits 1 (for ones’
complement), is a trap representation or a normal value. In the case
of sign and magnitude and ones’ complement, if this representation is
a normal value it is called a negative zero.
So I am wondering whether something comparable exists in C++.
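To make the difference between the three permitted encodings concrete, here is a small illustration of my own (assuming CHAR_BIT == 8, one sign bit and M = 7 value bits, so no padding):

```cpp
#include <cstdio>

int main() {
    // Bit pattern 1000'0001: sign bit set, value bits 0000001.
    // Interpreted under the three encodings the C wording permits:
    int value_bits = 0b0000001;
    std::printf("sign and magnitude: %d\n", -value_bits);       // -1
    std::printf("two's complement:   %d\n", value_bits - 128);  // sign bit = -(2^7)     -> -127
    std::printf("ones' complement:   %d\n", value_bits - 127);  // sign bit = -(2^7 - 1) -> -126
}
```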
This is what C++11 says about representation of signed integer types:
C++11 N3337 3.9.1 [basic.fundamental] P7:
The representations of integral types shall define values by use of a pure binary numeration system.⁴⁹ [ Example: this International Standard permits 2’s complement, 1’s complement and signed magnitude representations for integral types. — end example ]
where Footnote 49 reads:
49) A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral power of 2, except perhaps for the bit with the highest
position. (Adapted from the American National Dictionary for Information Processing Systems.)
Thus C++ allows the same three options as C, as well as anything else satisfying Footnote 49, which makes it a superset of what C allows. By Footnote 49, however, only the highest bit is allowed to have special meaning.
I'm guessing the answer to the question you asked is no.
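As a sketch of how an implementation's actual choice can be observed (my own illustration, assuming CHAR_BIT == 8 and relying on signed char having no padding bits), one can inspect the object representation of -1:

```cpp
#include <cstdio>
#include <cstring>

int main() {
    // With 8 bits (1 sign bit + 7 value bits), the value -1 has a distinct
    // bit pattern under each encoding permitted by the standard.
    signed char v = -1;
    unsigned char bits;
    std::memcpy(&bits, &v, 1);
    if (bits == 0xFF)      std::puts("two's complement");   // 1111'1111
    else if (bits == 0xFE) std::puts("ones' complement");   // 1111'1110
    else if (bits == 0x81) std::puts("sign and magnitude"); // 1000'0001
    else                   std::puts("some other pure binary representation");
}
```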
I think the C++ Standard specifies the minimum size and the range of values that each integer type must be able to represent. I don't believe the standard speaks specifically to any of the constraints you list.
I think those are all implementation details.
I think it would be odd to find a C++ implementation that used more than a single bit to hold the sign, or that did not use 0 for positive and 1 for negative. But I don't think the C++ Standard specifically requires it.
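What the standard does guarantee can be queried portably; a small sketch (the names below are standard, but the printed numbers are implementation-specific):

```cpp
#include <climits>
#include <cstdio>
#include <limits>

int main() {
    // The standard mandates only minimum ranges (int must cover at least
    // -32767..32767); width and padding are implementation details.
    std::printf("int: %d value bits + 1 sign bit in a %zu-bit object\n",
                std::numeric_limits<int>::digits,
                sizeof(int) * CHAR_BIT);
    // Any bits beyond digits + 1 (the sign bit) are padding bits.
}
```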
The C++ Standard is based in particular on the C Standard, where it is written (6.2.6.2 Integer types):
2 For signed integer types, the bits of the object representation
shall be divided into three groups: value bits, padding bits, and the
sign bit. There need not be any padding bits; signed char shall not
have any padding bits. There shall be exactly one sign bit.....
The requirement that there be exactly one sign bit means that it must be possible to identify a bit which is set for all negative numbers, and clear for all non-negative numbers. An implementation may include within an "int" any number of padding bits, impose arbitrary restrictions on their values, and treat as trap representations any bit patterns that violate those requirements, provided that all calculations that yield defined integer values produce bit patterns that the implementation will accept.
For example, an implementation could store "int" as two 16-bit words and specify that the MSB of the first word is the sign bit. Such an implementation could require that bits 0-14 of the first word match the sign bit and trap when reading any value where they don't, or make those bits match bits 1-15 of the second word (likewise trapping), or could write arbitrary values to those bits and ignore them when reading, or do just about anything else with them. If an implementation always wrote the top word as all ones or all zeros, any bit of it could be designated the "sign bit" and it wouldn't matter; the rest would all be "padding bits".
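A sketch of that hypothetical layout (purely illustrative, not any real ABI), decoding under the assumption that the value bits use two's complement:

```cpp
#include <cstdint>

// Hypothetical "int": two 16-bit words, where the MSB of the first word
// is the sign bit, bits 0-14 of the first word are padding, and the 16
// bits of the second word are value bits (so M = 16).
struct HypotheticalInt {
    std::uint16_t first;   // bit 15: sign bit; bits 0-14: padding
    std::uint16_t second;  // 16 value bits
};

// Two's complement: a set sign bit contributes -(2^16) to the value.
long decode(HypotheticalInt r) {
    long sign = (r.first >> 15) & 1u;
    return static_cast<long>(r.second) - sign * 65536L;
}
```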
The requirement that there be a single sign bit would mainly rule out
implementations where e.g. positive numbers may arbitrarily be represented
as bit pattern 00 or 11, and negative numbers as 01 or 10. On such an
implementation, it would be necessary to examine two bits rather than one
to determine whether a number was negative or non-negative.
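In other words, the single-sign-bit rule is what allows negativity to be read from one fixed bit of the object representation. A sketch of such a test (the byte and bit position are implementation-specific; the index below assumes the sign bit sits in the top bit of the highest-addressed byte, as on a typical little-endian two's-complement machine):

```cpp
#include <climits>
#include <cstring>

bool is_negative(int v) {
    // Copy the object representation and test a single bit of it.
    unsigned char bytes[sizeof(int)];
    std::memcpy(bytes, &v, sizeof v);
    return (bytes[sizeof(int) - 1] >> (CHAR_BIT - 1)) & 1u;
}
```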