What does 'Natural Size' really mean in C++?

Posted 2019-02-17 11:08

Question:

I understand that the 'natural size' is the width of integer that is processed most efficiently by a particular piece of hardware. When a short is used in an array or in arithmetic operations, the short integer must first be converted into an int.

Q: What exactly determines this 'natural size'?

I am not looking for simple answers such as

If it has a 32-bit architecture, its natural size is 32-bit

I want to understand why this is most efficient, and why a short must be converted before doing arithmetic operations on it.

Bonus Q: What happens when arithmetic operations are conducted on a long integer?

Answer 1:

the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.

Not really. Consider the x64 architecture. Arithmetic on any size from 8 to 64 bits will be essentially the same speed. So why have all x64 compilers settled on a 32-bit int? Well, because there was a lot of code out there which was originally written for 32-bit processors, and a lot of it implicitly relied on ints being 32-bits. And given the near-uselessness of a type which can represent values up to nine quintillion, the extra four bytes per integer would have been virtually unused. So we've decided that 32-bit ints are "natural" for this 64-bit platform.

Compare the 80286 architecture. Only 16 bits in a register. Performing 32-bit integer addition on such a platform basically requires splitting it into two 16-bit additions. Doing virtually anything with a 32-bit value involves splitting it up, with an attendant slowdown. The 80286's "natural integer size" is most definitely not 32 bits.

So really, "natural" comes down to considerations like processing efficiency, memory usage, and programmer-friendliness. It is not an acid test. It is very much a matter of subjective judgment on the part of the architecture/compiler designer.



Answer 2:

Generally speaking, each computer architecture is designed such that certain type sizes provide the most efficient numerical operations. The specific size then depends on the architecture, and the compiler will select an appropriate size. More detailed explanations as to why hardware designers selected certain sizes for particular hardware would be out of scope for Stack Overflow.

A short must be promoted to int before integral operations are performed on it because that's the way it was in C, and C++ inherited that behavior; there was little reason to change it, and doing so could have broken existing code. I'm not sure why the rule was originally added to C, but one could speculate that it's related to "default int", where the compiler assumed int whenever no type was specified.
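You can watch the promotion happen by inspecting the type of a short + short expression. A minimal sketch:

    #include <type_traits>

    int main() {
        short a = 1, b = 2;
        // Both operands are promoted to int before the addition, so the
        // result of a + b has type int, not short.
        static_assert(std::is_same<decltype(a + b), int>::value,
                      "short + short yields int");
        return a + b;   // the sum is already an int, matching main's return type
    }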

Bonus A: from clause 5 (expressions), paragraph 9, of the C++ standard we learn: Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

And then of interest specifically:

  • floating point rules that don't matter here
  • Otherwise, the integral promotions (4.5) shall be performed on both operands
  • Then, if either operand is unsigned long the other shall be converted to unsigned long.
  • Otherwise, if one operand is a long int and the other unsigned int, then if a long int can represent all the values of an unsigned int, the unsigned int shall be converted to a long int; otherwise both operands shall be converted to unsigned long int.
  • Otherwise, if either operand is long, the other shall be converted to long.

In summary, the compiler tries to use the "best" common type it can for binary operations, with int being the smallest type used.
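The effect of those rules can be checked directly with decltype. A minimal sketch (the mixed long/unsigned case depends on the data model, as noted in the comment):

    #include <type_traits>

    int main() {
        int      i = 2;
        unsigned u = 3u;
        long     l = 4l;
        (void)i; (void)u; (void)l;   // silence unused-variable warnings

        // int vs unsigned int: the int operand is converted to unsigned.
        static_assert(std::is_same<decltype(i + u), unsigned>::value, "");

        // long vs int: the int operand is converted to long.
        static_assert(std::is_same<decltype(l + i), long>::value, "");

        // long vs unsigned int: on LP64 (long is 64 bits) the result is long,
        // because long can hold every unsigned int value; on LLP64
        // (e.g. 64-bit Windows) both are converted to unsigned long instead.
        static_assert(std::is_same<decltype(l + u), long>::value
                   || std::is_same<decltype(l + u), unsigned long>::value, "");
    }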



Answer 3:

What exactly determines this 'natural size'?

For some processors (e.g. 32-bit ARM, and most DSP-style processors), it's determined by the architecture; the processor registers are a particular size, and arithmetic can only be done on values of that size.

Others (e.g. Intel x64) are more flexible, and there's no single "natural" size; it's up to the compiler designers to choose a size, a compromise between efficiency, range of values, and memory usage.

why this is most efficient

If the processor requires values to be a particular size for arithmetic, then choosing another size will force you to convert the values to the required size, probably at some cost.

why a short must be converted before doing arithmetic operations on it

Presumably, that was a good match for the behaviour of commonly-used processors when C was developed, half a century ago. C++ inherited the promotion rules from C. I can't really comment on exactly why it was deemed a good idea, since I wasn't born then.

What happens when arithmetic operations are conducted on a long integer?

If the processor registers are large enough to hold a long, then the arithmetic will be much the same as for int. Otherwise, the operations will have to be broken down into several operations on values split between multiple registers.
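For a concrete picture of that splitting, here is a purely illustrative sketch of how a 64-bit addition can be emulated with 32-bit operations (the function name add64 is made up for the example; a real compiler targeting a 32-bit CPU would typically emit an add / add-with-carry instruction pair instead):

    #include <cstdint>

    // Add two 64-bit values using only 32-bit arithmetic: add the low
    // halves, detect the carry, then add the high halves plus the carry.
    std::uint64_t add64(std::uint64_t a, std::uint64_t b) {
        std::uint32_t a_lo = static_cast<std::uint32_t>(a);
        std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
        std::uint32_t b_lo = static_cast<std::uint32_t>(b);
        std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

        std::uint32_t lo    = a_lo + b_lo;            // wraps modulo 2^32
        std::uint32_t carry = (lo < a_lo) ? 1u : 0u;  // wrap-around means carry out
        std::uint32_t hi    = a_hi + b_hi + carry;

        return (static_cast<std::uint64_t>(hi) << 32) | lo;
    }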



Answer 4:

I understand that the 'natural size' is the width of integer that is processed most efficiently by a particular hardware.

That's an excellent start.

Q: What exactly determines this 'natural size'?

The paragraph above is the definition of "natural size". Nothing else determines it.

I want to understand why this is most efficient

By definition.

and why a short must be converted before doing arithmetic operations on it.

It is so because the C language definition says so. There are no deep architectural reasons (there could have been some when C was invented).

Bonus Q: What happens when arithmetic operations are conducted on a long integer?

A bunch of electrons rushes through dirty sand and meets a bunch of holes. (No, really. Ask a vague question...)