What happens when I assign long int to int in C?

Posted 2019-01-11 00:56

Question:

In a recent homework assignment I was told to use a long variable to store a result, since it may be a big number.

I decided to check whether it really matters on my system (Intel Core i5, 64-bit Windows 7, GNU gcc compiler) and found that the following code:

/* sizeof yields a size_t, so the matching format specifier is %zu, not %d */
printf("sizeof(char) => %zu\n", sizeof(char));
printf("sizeof(short) => %zu\n", sizeof(short));
printf("sizeof(short int) => %zu\n", sizeof(short int));
printf("sizeof(int) => %zu\n", sizeof(int));
printf("sizeof(long) => %zu\n", sizeof(long));
printf("sizeof(long int) => %zu\n", sizeof(long int));
printf("sizeof(long long) => %zu\n", sizeof(long long));
printf("sizeof(long long int) => %zu\n", sizeof(long long int));

produces the following output:

sizeof(char) => 1
sizeof(short) => 2
sizeof(short int) => 2
sizeof(int) => 4
sizeof(long) => 4
sizeof(long int) => 4
sizeof(long long) => 8
sizeof(long long int) => 8

In other words, on my system, int and long are the same, and whatever will be too big for int to hold, will be too big for long to hold as well.

The homework assignment itself is not the issue here. I wonder how, on a system where int is narrower than long, I should assign a long to an int.

I'm aware that there are numerous closely related questions on this subject, but I feel that their answers do not give me a complete understanding of what will or may happen in the process.

Basically I'm trying to figure out the following:

  1. Should I cast long to int before the assignment, or, since long is not a different data type but merely a modifier, is it considered harmless to assign directly?
  2. What happens on systems where long > int? Will the result be undefined (or unpredictable), or will the extra parts of the variable be discarded?
  3. How does casting from long to int work in C?
  4. How does assignment from long to int work in C when I don't use a cast?

Answer 1:

The language guarantees that int is at least 16 bits, long is at least 32 bits, and long can represent at least all the values that int can represent.
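These guarantees can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler (for _Static_assert); the function name long_covers_int is made up for illustration.

```c
#include <limits.h>

/* Compile-time checks of the minimum ranges the C standard requires. */
_Static_assert(INT_MAX >= 32767, "int must have at least 16 bits");
_Static_assert(LONG_MAX >= 2147483647L, "long must have at least 32 bits");

/* Hypothetical helper: returns 1 if every int value is representable
   as a long, which the language guarantees on every implementation. */
int long_covers_int(void)
{
    return LONG_MIN <= INT_MIN && LONG_MAX >= INT_MAX;
}
```

On a conforming implementation the static assertions never fire and long_covers_int() returns 1.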

If you assign a long value to an int object, it will be implicitly converted. There's no need for an explicit cast; it would merely specify the same conversion that's going to happen anyway.
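As a small illustration, the two hypothetical functions below (the names are not from the question) perform exactly the same conversion; the cast only makes it visible in the source.

```c
/* Hypothetical helper: relies on the implicit conversion on assignment. */
int assign_implicit(long value)
{
    int i = value;      /* long is implicitly converted to int */
    return i;
}

/* Hypothetical helper: spells out the same conversion with a cast. */
int assign_with_cast(long value)
{
    int i = (int)value; /* same conversion, written explicitly */
    return i;
}
```

Some compilers can be asked to warn about the implicit form (GCC's -Wconversion option, for example), in which case a cast documents that the narrowing is intentional. It does not change the result.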

On your system, where int and long happen to have the same size and range, the conversion is trivial; it simply copies the value.

On a system where long is wider than int, if the value won't fit in an int, then the result of the conversion is implementation-defined. (Or, starting in C99, it can raise an implementation-defined signal, but I don't know of any compilers that actually do that.) What typically happens is that the high-order bits are discarded, but you shouldn't depend on that. (The rules are different for unsigned types; the result of converting a signed or unsigned integer to an unsigned type is well defined.)
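The well-defined unsigned case can be sketched as follows. The helper name to_unsigned is an assumption made for this example; the behavior shown (reduction modulo UINT_MAX + 1) is what the standard specifies for conversion to an unsigned type.

```c
#include <limits.h>

/* Hypothetical helper: converting to an unsigned type reduces the value
   modulo UINT_MAX + 1, so the result is well defined for any input. */
unsigned int to_unsigned(long long value)
{
    return (unsigned int)value;
}
```

For example, to_unsigned(-1) yields UINT_MAX on every conforming implementation. No similarly portable statement can be made about converting an out-of-range value to a signed int.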

If you need to safely assign a long value to an int object, you can check that it will fit before doing the assignment:

#include <limits.h> /* for INT_MIN, INT_MAX */

/* ... */

int i;
long li = /* whatever */;

if (li >= INT_MIN && li <= INT_MAX) {
    i = li;
}
else {
    /* do something else? */
}

The details of "something else" are going to depend on what you want to do.
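Two common choices for "something else" are reporting the failure to the caller or clamping to the nearest representable value. The sketch below shows both; the function names are hypothetical, and which policy is appropriate depends on the program.

```c
#include <limits.h>

/* Hypothetical helper: stores li in *out and returns 1 if it fits in an
   int; otherwise returns 0 and leaves *out untouched. */
int try_long_to_int(long li, int *out)
{
    if (li >= INT_MIN && li <= INT_MAX) {
        *out = (int)li;
        return 1;
    }
    return 0;
}

/* Hypothetical helper: clamps li to the range of int (saturation). */
int clamp_long_to_int(long li)
{
    if (li > INT_MAX) {
        return INT_MAX;
    }
    if (li < INT_MIN) {
        return INT_MIN;
    }
    return (int)li;
}
```

On a system where int and long have the same range, try_long_to_int always succeeds and clamp_long_to_int returns its argument unchanged.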

One correction: int and long are always distinct types, even if they happen to have the same size and representation. Arithmetic types are freely convertible, so this often doesn't make any difference, but for example int* and long* are distinct and incompatible types; you can't assign a long* to an int*, or vice versa, without an explicit (and potentially dangerous) cast.
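One way to see that the types are distinct even when their sizes match is C11's _Generic, which selects on the static type of an expression. This is a minimal sketch assuming a C11 compiler; the macro names IS_INT and IS_LONG are made up for illustration.

```c
/* Hypothetical macros: evaluate to 1 or 0 depending on the static type
   of the argument. Listing int and long as separate associations in a
   single _Generic is only legal because they are distinct types. */
#define IS_INT(x)  _Generic((x), int: 1, default: 0)
#define IS_LONG(x) _Generic((x), long: 1, default: 0)
```

IS_INT(0) is 1 and IS_LONG(0) is 0, while IS_LONG(0L) is 1, regardless of whether sizeof(int) equals sizeof(long) on the platform.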

And if you find yourself needing to convert a long value to int, the first thing you should do is reconsider your code's design. Sometimes such conversions are necessary, but more often they're a sign that the int to which you're assigning should have been defined as a long in the first place.



Answer 2:

A long can always represent all values of int. If the value at hand can be represented by the type of the variable you assign to, then the value is preserved.

If it can't be represented, then for a signed destination type the result is formally implementation-defined (or, since C99, an implementation-defined signal may be raised), while for an unsigned destination type it is specified as the original value modulo 2^n, where n is the number of bits in the value representation (which is not necessarily all the bits in the destination).

In practice, on modern machines you get wrapping also for signed types.

That's because modern machines use two's complement form to represent signed integers, without any bits used to denote "invalid value" or such – i.e., all bits used for value representation.

With an n-bit value representation, any integer value x is mapped to x + K*2^n, with the integer constant K chosen such that the result is in the range where half of the possible values are negative.
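The mapping just described (add or subtract a multiple of 2 to the power n until the value lands in the signed range) can be written out for n = 32 without relying on implementation-defined behavior. This is a sketch; the function name wrap_to_int32 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: maps x to the unique value congruent to x
   modulo 2^32 that lies in the range of int32_t. */
int32_t wrap_to_int32(int64_t x)
{
    uint32_t low = (uint32_t)x;  /* well defined: x modulo 2^32 */

    if (low <= (uint32_t)INT32_MAX) {
        return (int32_t)low;     /* already in the non-negative half */
    }
    /* Upper half of the unsigned range: subtract 2^32 to reach the
       negative half. The difference always fits in int32_t. */
    return (int32_t)((int64_t)low - INT64_C(4294967296));
}
```

This reproduces the wrapping that typical two's complement machines perform when a wider signed value is stored in a 32-bit int, but it only uses conversions whose results the standard defines.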

Thus, for example, with a 32-bit int the value -7 is represented as bit pattern number -7 + 2^32 = 2^32 - 7, so that if you display the number that the bit pattern stands for as an unsigned integer, you get a pretty large number.
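The -7 example can be checked with the fixed-width types from <stdint.h>. This is a small sketch; the function name bit_pattern_as_unsigned is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: returns the number that the 32-bit two's
   complement bit pattern of x stands for when read as unsigned.
   Conversion to uint32_t is defined as x modulo 2^32, which gives
   exactly that number. */
uint32_t bit_pattern_as_unsigned(int32_t x)
{
    return (uint32_t)x;
}
```

bit_pattern_as_unsigned(-7) is 4294967289, that is, 4294967296 - 7.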

The reason this is called two's complement is that it makes sense for the binary numeral system, the base-two numeral system. For the binary numeral system there's also a ones' (note the placement of the apostrophe) complement. Similarly, for the decimal numeral system there's ten's complement and nines' complement. With a 4-digit ten's complement representation you would represent -7 as 10000 - 7 = 9993. That's all, really.
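The decimal analogy can be computed the same way as the binary case, with a power of ten as the modulus. This is an illustrative sketch only; the function name tens_complement and the limit on the number of digits are assumptions of the example.

```c
/* Hypothetical helper: returns the 'digits'-digit ten's complement
   representation of x. Assumes 1 <= digits <= 9, so that 10^digits
   fits in a long on every conforming implementation. */
long tens_complement(long x, int digits)
{
    long modulus = 1;
    for (int d = 0; d < digits; d++) {
        modulus *= 10;
    }

    long r = x % modulus;  /* in C, r can be negative when x is negative */
    if (r < 0) {
        r += modulus;      /* shift into the range 0 .. modulus - 1 */
    }
    return r;
}
```

tens_complement(-7, 4) gives 9993, matching the arithmetic above.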