Is `long` guaranteed to be at least 32 bits?

Posted 2019-01-03 10:20

By my reading of the C++ Standard, I have always understood that the sizes of the integral fundamental types in C++ were as follows:

sizeof(char) <= sizeof(short int) <= sizeof(int) <= sizeof(long int)

I deduced this from 3.9.1/2:

  There are four signed integer types: “signed char”, “short int”, “int”, and “long int.” In this list, each type provides at least as much storage as those preceding it in the list. Plain ints have the natural size suggested by the architecture of the execution environment

Further, the size of char is described by 3.9.1/ as being:

  [...] large enough to store any member of the implementation’s basic character set.

1.7/1 defines this in more concrete terms:

  The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set and is composed of a contiguous sequence of bits, the number of which is implementation-defined.

This leads me to the following conclusion:

1 == sizeof(char) <= sizeof(short int) <= sizeof(int) <= sizeof(long int)

where sizeof tells us how many bytes the type is. Furthermore, it is implementation-defined how many bits are in a byte. Most of us are probably used to dealing with 8-bit bytes, but the Standard says there are n bits in a byte.
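As a rough sketch of the chain above, the ordering can be written as compile-time checks. The helper name storage_bits is hypothetical (not a standard facility), and the checks encode the reading of 3.9.1/2 given here, namely that "at least as much storage" means a non-decreasing sizeof.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: sizeof counts bytes (units of char), so the storage
// occupied by T, measured in bits, is sizeof(T) * CHAR_BIT.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// The chain from the question, checked when the file is compiled.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(char) <= sizeof(short), "short holds at least a char");
static_assert(sizeof(short) <= sizeof(int), "int holds at least a short");
static_assert(sizeof(int) <= sizeof(long), "long holds at least an int");
```

Note that none of these checks says anything about the number of bits: storage_bits<long>() depends on CHAR_BIT, which is exactly the gap the question is asking about.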


In this post, Alf P. Steinbach says:

long is guaranteed (at least) 32 bits.

This flies in the face of everything I understand the size of the fundamental types to be in C++ according to the Standard. Normally I would just discount this statement as a beginner being wrong, but since this was Alf I decided it was worth investigating further.

So, what say you? Is a long guaranteed by the standard to be at least 32 bits? If so, please be specific as to how this guarantee is made. I just don't see it.

  1. The C++ Standard specifically says that in order to know C++ you must know C (1.2/1; see footnote 1).

  2. The C++ Standard implicitly defines the minimum range of values a long must accommodate to be LONG_MIN to LONG_MAX (see footnote 2).

So no matter how big a long is, it has to be big enough to hold LONG_MIN to LONG_MAX.

But Alf and others are specific that a long must be at least 32 bits. This is what I'm trying to establish. The C++ Standard is explicit that the number of bits in a byte is not specified (it could be 4, 8, 16, 42). So how is the connection made from being able to accommodate the numbers LONG_MIN to LONG_MAX to being at least 32 bits?


(1) 1.2/1: The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

  • ISO/IEC 2382 (all parts), Information technology – Vocabulary
  • ISO/IEC 9899:1999, Programming languages – C
  • ISO/IEC 10646-1:2000, Information technology – Universal Multiple-Octet Coded Character Set (UCS) – Part 1: Architecture and Basic Multilingual Plane

(2) Defined in <climits> as:

LONG_MIN -2147483647 // -(2^31 - 1)
LONG_MAX +2147483647 //   2^31 - 1

5 answers
爱情/是我丢掉的垃圾
Post #2 · 2019-01-03 10:23

Yes, the C++ standard is explicit that the number of bits in a byte is not specified. The number of bits in a long isn't specified, either.

Setting a lower bound on a number is not specifying it.

The C++ standard says, in one place:

1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long).

It says, in effect, in another place, via inclusion of the C standard:

CHAR_BIT >= 8; SHORT_BITS >= 16; INT_BITS >= 16; LONG_BITS >= 32

(except that AFAIK, the identifiers SHORT_BITS, INT_BITS and LONG_BITS don't exist, and that these limits are inferred by the requirements for minimum values on the types.)

This follows from the fact that a certain number of bits are required, mathematically, to encode all of the values in the (e.g. for longs) LONG_MIN..LONG_MAX range.
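One way to see these inferred limits on an actual implementation is through std::numeric_limits, which does exist even though SHORT_BITS and friends do not. The sketch below assumes C++11 or later; width_bits is a hypothetical helper name. It counts the value bits reported by digits and adds one bit for the sign on signed types.

```cpp
#include <climits>   // CHAR_BIT
#include <limits>    // std::numeric_limits

// Hypothetical helper: numeric_limits<T>::digits is the number of non-sign
// bits in T, so signed types get one extra bit for the sign.
template <typename T>
constexpr int width_bits() {
    return std::numeric_limits<T>::digits
         + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// The lower bounds that follow from the minimum ranges in <climits>.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(width_bits<short>() >= 16, "short has at least 16 bits");
static_assert(width_bits<int>() >= 16, "int has at least 16 bits");
static_assert(width_bits<long>() >= 32, "long has at least 32 bits");
```

These are lower bounds only; on a typical 64-bit Linux toolchain width_bits<long>() is larger than 32, and the assertions still hold.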

Finally, shorts, ints and longs must all be made up of an integral number of chars; sizeof() always reports an integral value. Also, iterating through memory char by char must access every bit, which places some practical limitations.

These requirements are not inconsistent in any way. Any sizes that satisfy the requirements are OK.

There were machines long ago with a native word size of 36 bits. If you were to port a C++ compiler to them, you could legally decide to have 9 bits in a char, 18 in both short and int, and 36 in long. You could also legally decide to have 36 bits in each of those types, for the same reason that you can have 32 bits in an int on a typical 32-bit system today. There are real-world implementations that use 64-bit chars.
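To make "any sizes that satisfy the requirements are OK" concrete, here is a small sketch that tests a proposed set of widths against the constraints listed in this answer. Everything in it is hypothetical: the Widths struct, the function names, and the simplifying assumption that each type has no padding bits, so its storage width equals its value width.

```cpp
// Hypothetical description of an implementation's integer widths, in bits.
struct Widths {
    int char_bits;
    int short_bits;
    int int_bits;
    int long_bits;
};

// True when width is a whole number of units.
constexpr bool multiple_of(int width, int unit) {
    return width % unit == 0;
}

// Checks the constraints named in this answer: the minimum widths implied by
// <climits>, the non-decreasing size order, and the rule that every type is
// made up of a whole number of chars.
constexpr bool is_permitted(Widths w) {
    return w.char_bits >= 8 && w.short_bits >= 16
        && w.int_bits >= 16 && w.long_bits >= 32
        && w.char_bits <= w.short_bits
        && w.short_bits <= w.int_bits
        && w.int_bits <= w.long_bits
        && multiple_of(w.short_bits, w.char_bits)
        && multiple_of(w.int_bits, w.char_bits)
        && multiple_of(w.long_bits, w.char_bits);
}

// The two 36-bit layouts described above both pass.
static_assert(is_permitted(Widths{9, 18, 18, 36}), "9/18/18/36 is allowed");
static_assert(is_permitted(Widths{36, 36, 36, 36}), "36/36/36/36 is allowed");
// A 16-bit long is ruled out by the LONG_MIN..LONG_MAX requirement.
static_assert(!is_permitted(Widths{8, 16, 16, 16}), "16-bit long is too small");
```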

See also sections 26.1-6 and 29.5 of the C++ FAQ Lite.

地球回转人心会变
Post #3 · 2019-01-03 10:31

But Alf and others are specific that a long must be at least 32 bits. This is what I'm trying to establish. The C++ Standard is explicit that the number of bits in a byte is not specified. Could be 4, 8, 16, 42... So how is the connection made from being able to accommodate the numbers LONG_MIN to LONG_MAX to being at least 32 bits?

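You need 32 bits in the value representation in order to get at least that many distinct bit patterns: the range -2147483647 to +2147483647 contains 2^32 - 1 values, which is more than 31 bits can encode. And since C++ requires a binary representation of integers (explicit language to that effect in the standard, §3.9.1/7), Q.E.D.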
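The counting argument can be spelled out as a short compile-time calculation. This is only an illustrative sketch: bits_for is a hypothetical helper, written as a single-return recursive function so that it compiles as C++11, and it assumes count is at most 2^63 so the shift stays in range.

```cpp
// Hypothetical helper: the smallest n such that n bits give at least
// `count` distinct bit patterns, i.e. 2^n >= count.
// Precondition (assumed, not checked): count <= 2^63.
constexpr int bits_for(unsigned long long count, int n = 0) {
    return ((1ULL << n) >= count) ? n : bits_for(count, n + 1);
}

// Number of distinct values in the range -2147483647 .. +2147483647.
constexpr unsigned long long long_value_count = 2ULL * 2147483647ULL + 1ULL;

static_assert(long_value_count == 4294967295ULL, "2^32 - 1 values");
static_assert(bits_for(long_value_count) == 32, "31 bits are not enough");
```

With 31 bits there are only 2^31 = 2147483648 patterns, fewer than the 4294967295 values required, so 32 is the minimum regardless of how the sign is encoded.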

我欲成王,谁敢阻挡
Post #4 · 2019-01-03 10:35

The answer is definitively YES. Read my OP and all the comments to understand why exactly, but here's the short version. If you doubt or question any of this, I encourage you to read the entire thread and all of the comments. Otherwise accept this as true:

  1. The C++ standard includes parts of the C standard, including the definitions for LONG_MIN and LONG_MAX
  2. LONG_MIN is defined as no greater than -2147483647
  3. LONG_MAX is defined as no less than +2147483647
  4. In C++ integral types are stored in binary in the underlying representation
  5. In order to represent -2147483647 and +2147483647 in binary, you need 32 bits.
  6. A C++ long is guaranteed to be able to represent the minimum range LONG_MIN through LONG_MAX

Therefore a long must be at least 32 bits (1).

EDIT:

LONG_MIN and LONG_MAX have values with magnitudes dictated by the C standard (ISO/IEC 9899:TC3) in §5.2.4.2.1:

[...] Their implementation-defined values shall be equal or greater in magnitude [...] (absolute value) to those shown, with the same sign [...]

— minimum value for an object of type long int
LONG_MIN -2147483647 // -(2 ^ 31 - 1)
— maximum value for an object of type long int
LONG_MAX +2147483647 // 2 ^ 31 - 1

(1) 32 bits: This does not mean that sizeof(long) >= 4, because a byte is not necessarily 8 bits. According to the Standard, a byte is an implementation-defined number of bits. While most readers will find this odd, there is real hardware on which CHAR_BIT is 16 or 32.
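The footnote's point can be sketched numerically: the smallest sizeof(long) that still gives 32 bits depends on the byte width. The helper min_bytes_for_32_bits below is hypothetical and simply rounds 32 / bits_per_byte up to a whole number of bytes.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: number of bytes needed to hold 32 bits when each
// byte has bits_per_byte bits (ceiling division).
constexpr int min_bytes_for_32_bits(int bits_per_byte) {
    return (32 + bits_per_byte - 1) / bits_per_byte;
}

// What the guarantee means for sizeof on the implementation compiling this.
static_assert(sizeof(long) >= static_cast<std::size_t>(min_bytes_for_32_bits(CHAR_BIT)),
              "long must occupy enough bytes to hold 32 bits");

// With wider bytes, a conforming sizeof(long) can be smaller than 4.
static_assert(min_bytes_for_32_bits(8) == 4, "8-bit bytes: at least 4 bytes");
static_assert(min_bytes_for_32_bits(16) == 2, "16-bit bytes: at least 2 bytes");
static_assert(min_bytes_for_32_bits(32) == 1, "32-bit bytes: 1 byte suffices");
```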

太酷不给撩
Post #5 · 2019-01-03 10:39

C++ uses the limits defined in the C standard (C++: 18.3.2 (c.limits), C: 5.2.4.2.1):

LONG_MIN -2147483647 // -(2^31 - 1)
LONG_MAX +2147483647 //   2^31 - 1

So you are guaranteed that a long is at least 32 bits.

And if you want to follow the long circuitous route to whether LONG_MIN/LONG_MAX are representable by a long, you have to look at 18.3.1.2 (numeric.limits.members) in the C++ standard:

static constexpr T min() throw(); // Equivalent to CHAR_MIN, SHRT_MIN, FLT_MIN, DBL_MIN, etc.
static constexpr T max() throw(); // Equivalent to CHAR_MAX, SHRT_MAX, FLT_MAX, DBL_MAX, etc.

I moved the footnotes into the comment, so it's not exactly what appears in the standard. But it basically implies that std::numeric_limits<long>::min()==LONG_MIN==(long)LONG_MIN and std::numeric_limits<long>::max()==LONG_MAX==(long)LONG_MAX.

So, even though the C++ standard does not specify the bitwise representation of negative (signed) numbers, a long needs at least 32 bits either way: a two's complement representation needs 32 bits in total to cover the range, and a representation with an explicit sign bit needs 31 value bits plus the sign bit, which is also 32 bits of storage.
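The agreement between std::numeric_limits and the <climits> macros described in this answer can be checked directly. A minimal sketch, assuming C++11 or later (where min() and max() are constexpr); limits_match is a hypothetical helper name.

```cpp
#include <climits>   // INT_MIN, INT_MAX, LONG_MIN, LONG_MAX
#include <limits>    // std::numeric_limits

// Hypothetical helper: true when numeric_limits<T> reports the same
// extremes as the corresponding <climits> macros passed in.
template <typename T>
constexpr bool limits_match(T macro_min, T macro_max) {
    return std::numeric_limits<T>::min() == macro_min
        && std::numeric_limits<T>::max() == macro_max;
}

static_assert(limits_match<long>(LONG_MIN, LONG_MAX),
              "numeric_limits<long> agrees with LONG_MIN and LONG_MAX");
static_assert(limits_match<int>(INT_MIN, INT_MAX),
              "numeric_limits<int> agrees with INT_MIN and INT_MAX");
```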

一纸荒年 Trace。
Post #6 · 2019-01-03 10:44

The C++ standard notes that the contents of <climits> are the same as the C header <limits.h> (18.2.2 in ISO C++03 doc).

Unfortunately, I do not have a copy of the C standard that existed pre-C++98 (i.e. C90), but in C99 (section 5.2.4.2.1), <limits.h> has to provide at least these minimum values. I don't think this changed from C90, other than C99 adding the long long types.

— minimum value for an object of type long int

LONG_MIN -2147483647 // −(2^31 − 1)

— maximum value for an object of type long int

LONG_MAX +2147483647 // 2^31 − 1

— maximum value for an object of type unsigned long int

ULONG_MAX 4294967295 // 2^32 − 1

— minimum value for an object of type long long int

LLONG_MIN -9223372036854775807 // −(2^63− 1)
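As a sketch, the minimum magnitudes quoted above can be turned into a compile-time check against the values a given implementation actually provides. The function name meets_quoted_minimums is hypothetical, and the check covers only the four limits quoted in this answer; it assumes C++11 or later, where <climits> also provides the long long macros.

```cpp
#include <climits>   // LONG_MIN, LONG_MAX, ULONG_MAX, LLONG_MIN

// Hypothetical helper: compares the implementation's limits with the
// minimum magnitudes quoted from C99 5.2.4.2.1 in this answer.
constexpr bool meets_quoted_minimums() {
    return LONG_MIN  <= -2147483647L
        && LONG_MAX  >=  2147483647L
        && ULONG_MAX >=  4294967295UL
        && LLONG_MIN <= -9223372036854775807LL;
}

static_assert(meets_quoted_minimums(),
              "implementation provides at least the quoted ranges");
```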