The new C++ standard still refuses to specify the binary representation of integer types. Is this because there are real-world implementations of C++ that don't use 2's complement arithmetic? I find that hard to believe. Is it because the committee feared that future advances in hardware would render the notion of 'bit' obsolete? Again hard to believe. Can anyone shed any light on this?
Background: I was surprised twice in one comment thread (Benjamin Lindley's answer to this question). First, from piotr's comment:
Right shift on signed type is undefined behaviour
Second, from James Kanze's comment:
when assigning to a long, if the value doesn't fit in a long, the results are implementation defined
I had to look these up in the standard before I believed them. The only reason for them is to accommodate non-2's-complement integer representations. WHY?
It seems to me that, even today, if you are writing a broadly-applicable C++ library that you expect to run on any machine, then 2's complement cannot be assumed. C++ is just too widely used to be making assumptions like that.
Most people don't write those sorts of libraries, though, so if you want to take a dependency on 2's complement you should just go ahead.
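If you do take that dependency, one option is to state it in the code so that a build on an unusual platform fails loudly instead of misbehaving quietly. A minimal sketch, assuming a C++11 compiler; the name `is_twos_complement` is made up for illustration and is not a standard facility:

```cpp
#include <limits>

// Hypothetical helper. The expression -1 & 3 evaluates to 3 under
// two's complement, 2 under one's complement, and 1 under sign-magnitude.
constexpr bool is_twos_complement() {
    return (-1 & 3) == 3;
}

// Refuse to compile where the assumption does not hold.
static_assert(is_twos_complement(),
              "this code assumes two's complement integers");

// Range check: two's complement normally has one more negative value
// than positive values, so min() + 1 equals -max().
static_assert(std::numeric_limits<int>::min() + 1 ==
                  -std::numeric_limits<int>::max(),
              "this code assumes an asymmetric int range");
```

Both checks cost nothing at run time, and the message tells whoever ports the code exactly which assumption broke.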
I suppose it is because the Standard says, in 3.9.1 [basic.fundamental]/7:
this International Standard permits 2’s complement, 1’s complement and signed magnitude representations for integral types.
which, I am willing to bet, came along from the C programming language, which lists sign and magnitude, two's complement, and one's complement as the only allowed representations in 6.2.6.2/2. And there sure were 1's complement systems around when C became widespread: UNIVACs seem to be the most often mentioned.
Many aspects of the language standard are as they are because the Standards Committee has been extremely loath to forbid compilers from behaving in ways that existing code may rely upon. If code exists which would rely upon one's complement behavior, then requiring that compilers behave as though the underlying hardware uses two's complement would make it impossible for the older code to run using newer compilers.
The solution, which the Standards Committee has alas not yet seen fit to implement, would be to allow code to specify the semantics it wants in a fashion independent of the machine's word size or hardware characteristics. If support for code which relies upon one's-complement behavior is deemed important, the Standard could provide a means by which code expressly demands one's-complement behavior regardless of the underlying hardware platform. To avoid overly complicating every single compiler, it could make certain aspects of the standard optional, while requiring conforming compilers to document which aspects they support. Such a design would allow compilers for one's-complement machines to support both two's-complement and one's-complement behavior, depending upon the needs of the program. Further, it would make it possible to port such code to two's-complement machines whose compilers happened to include one's-complement support.
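Absent such a facility, the closest one can get today is to spell out the intended semantics in ordinary arithmetic. As a rough sketch (the name `floor_shift_right` is hypothetical, not anything from the standard), the function below computes what an arithmetic right shift yields on a two's-complement machine, namely floor(x / 2^n), without ever shifting a negative value. It assumes `int` has no padding bits, so that its width is `sizeof(int) * CHAR_BIT`.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: computes floor(x / 2^n) for any int x.
// Only non-negative values are ever shifted, so the result does not
// depend on how the implementation represents negative numbers.
int floor_shift_right(int x, unsigned n) {
    // Shifting by the full width or more is undefined, so reject it.
    // (Assumes int has no padding bits.)
    assert(n < sizeof(int) * CHAR_BIT);
    if (x >= 0) {
        return x >> n;
    }
    // For x < 0, -(x + 1) lies in [0, INT_MAX], so nothing overflows.
    // Identity used: floor(x / 2^n) == -1 - floor(-(x + 1) / 2^n).
    return -1 - ((-(x + 1)) >> n);
}
```

For example, `floor_shift_right(-7, 1)` gives -4, the same value `-7 >> 1` produces on typical two's-complement hardware, but here it follows from the arithmetic and not from the representation.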
I'm not sure exactly why the Standards Committee has not yet provided any way for code to specify behavior in a fashion independent of the underlying architecture and word size (so that code wouldn't have some machines use signed semantics for comparisons where other machines would use unsigned semantics), but for whatever reason it has yet to do so. Support for one's-complement representation is but a part of that.