Is there any reason not to use fixed-width integer types?

Posted 2019-01-08 13:42

Question:

Assuming you're using a compiler that supports C99 (or even just stdint.h), is there any reason not to use fixed width integer types such as uint8_t?

One reason that I'm aware of is that it makes much more sense to use chars when dealing with characters instead of using (u)int8_ts, as mentioned in this question.

But if you are planning on storing a number, when would you want to use a type whose size you don't know? That is, in what situation would you want to store a number in an unsigned short without knowing whether it is 8, 16, or even 32 bits, instead of using a uint16_t?

Following on from this, is it considered better practice to use fixed-width integers, or to use the normal integer types, never assume anything, and use sizeof wherever you need to know how many bytes they occupy?

Answer 1:

It's actually quite common to store a number without needing to know the exact size of the type. There are plenty of quantities in my programs that I can reasonably assume won't exceed 2 billion, or enforce that they don't. But that doesn't mean I need an exact 32 bit type to store them, any type that can count to at least 2 billion is fine by me.

If you're trying to write very portable code, you must bear in mind that the fixed-width types are all optional.

On a C99 implementation where CHAR_BIT is greater than 8 there is no int8_t. The standard forbids it to exist because it would have to have padding bits, and intN_t types are defined to have no padding bits (7.18.1.1/1). uint8_t is therefore also forbidden, because (thanks, ouah) an implementation is not permitted to define uint8_t without int8_t.

So, in very portable code, if you need a signed type capable of holding values up to 127 then you should use one of signed char, int, int_least8_t or int_fast8_t according to whether you want to ask the compiler to make it:

  • work in C89 (signed char or int)
  • avoid surprising integer promotions in arithmetic expressions (int)
  • small (int_least8_t or signed char)
  • fast (int_fast8_t or int)

The same goes for an unsigned type up to 255, with unsigned char, unsigned int, uint_least8_t and uint_fast8_t.

If you need modulo-256 arithmetic in very portable code, then you can either take the modulus yourself, mask bits, or play games with bitfields.

In practice, most people never need to write code that portable. At the moment CHAR_BIT > 8 only comes up on special-purpose hardware, and your general-purpose code won't get used on it. Of course that could change in the future, but if it does I suspect that there is so much code that makes assumptions about POSIX and/or Windows (both of which guarantee CHAR_BIT == 8) that dealing with your code's non-portability will be one small part of a big effort to port code to the new platform. Any such implementation is probably going to have to worry about how to connect to the internet (which deals in octets) long before it worries about how to get your code up and running :-)

If you're assuming that CHAR_BIT == 8 anyway then I don't think there's any particular reason to avoid (u)int8_t other than if you want the code to work in C89. Even in C89 it's not that difficult to find or write a version of stdint.h for a particular implementation. But if you can easily write your code to only require that the type can hold 255, rather than requiring that it can't hold 256, then you might as well avoid the dependency on CHAR_BIT == 8.



Answer 2:

One issue that hasn't yet been mentioned: while fixed-size integer types guarantee that the sizes of one's variables won't change when compilers use different sizes for int, long, and so forth, they don't necessarily guarantee that code will behave identically on machines with various integer sizes, even when the sizes are defined.

For example, given the declaration uint32_t i;, the behavior of the expression (i-1) > 5 when i is zero will vary depending upon whether uint32_t is smaller than int. On systems where e.g. int is 64 bits (and uint32_t is some narrower type), i would get promoted to int; the subtraction and comparison would be performed as signed (-1 is less than 5). On systems where int is 32 bits, the subtraction and comparison would be performed as unsigned int (the subtraction would yield a really big number, which is greater than five).

I don't know how much code relies upon the fact that intermediate results of expressions involving unsigned types are required to wrap even in the absence of typecasts (IMHO, if wrapping behavior was desired, the programmer should have included a typecast: (uint32_t)(i-1) > 5), but the standard presently allows no leeway. I wonder what problems would be posed by a rule that at least permitted a compiler to promote operands to a longer integer type in the absence of typecasts or type coercions [e.g. given uint32_t i, j, an assignment like j = (i+=1) >> 1; would be required to chop off the overflow, as would j = (uint32_t)(i+1) >> 1;, but j = (i+1) >> 1; would not]. Or, for that matter, how hard would it be for compiler manufacturers to guarantee that any integral-type expression whose intermediate results could all fit within the largest signed type, and which didn't involve right shifts by non-constant amounts, would yield the same results as if all calculations were performed on that type? It seems rather icky to me that on a machine where int is 32 bits:

  uint64_t a,b,c;
  ...
  a &= ~0x40000000;    /* 0x40000000 fits in int; the complement is a
                          negative int that sign-extends to 64 bits,
                          so exactly one bit is cleared               */
  b &= ~0x80000000;    /* 0x80000000 is unsigned int; ~ yields
                          0x7FFFFFFF, which zero-extends to 64 bits
                          and clears the top 33 bits                  */
  c &= ~0x100000000;   /* 0x100000000 already needs a 64-bit type;
                          the complement clears one bit as intended   */

clears one bit each of a and c, but clears the top 33 bits of b; most compilers will give no hint that anything is 'different' about the second expression.



Answer 3:

It is true that the width of a standard integer type may change from one platform to another, but not its minimum width.

For example the C Standard specifies that an int is at least 16-bit and a long is at least 32-bit wide.

If you don't have some size constraint when storing your objects, you can leave this to the implementation. For example, if your maximum signed value will fit in 16 bits, you can just use an int. You then let the implementation have the final word on what the natural int width is for the architecture it is targeting.



Answer 4:

You should only use the fixed width types when you make an assumption about the width.

uint8_t and unsigned char are the same on most platforms, but not on all. Using uint8_t emphasizes the fact that you assume an architecture with an 8-bit char; the code won't compile on others, and that is a feature.

Otherwise I'd use the "semantic" typedefs such as size_t, uintptr_t, and ptrdiff_t, because they reflect much better what you have in mind for the data. I almost never use the base types directly: int only for error returns, and I don't remember ever having used short.

Edit: After careful reading of C11 I conclude that uint8_t, if it exists, must be unsigned char and can't be plain char, even if that type is unsigned. This comes from the requirement in 7.20.1 p1 that all intN_t and uintN_t must be the corresponding signed and unsigned types. The only such pair among the character types is signed char and unsigned char.



Answer 5:

The code should reveal to the casual reader (and to the programmers themselves) what is important. Is it just some integer, an unsigned integer, or specifically a signed integer? The same goes for size. Is it really important to the algorithm that some variable is 16 bits wide? Or is that just unnecessary micromanagement and a failed attempt at optimization?

This is what makes programming an art -- to show what's important.