In this answer, zwol made this claim:
The correct way to convert two bytes of data from an external source into a 16-bit signed integer is with helper functions like this:
#include <stdint.h>

int16_t be16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 8) |
                   (((uint32_t)data[1]) << 0);
    return ((int32_t) val) - 0x10000u;
}

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 0) |
                   (((uint32_t)data[1]) << 8);
    return ((int32_t) val) - 0x10000u;
}
Which of the above functions is appropriate depends on whether the array contains a little endian or a big endian representation. Endianness is not the issue in question here; I am wondering why zwol subtracts 0x10000u from the uint32_t value converted to int32_t.
Why is this the correct way?
How does it avoid the implementation defined behavior when converting to the return type?
Since you can assume 2's complement representation, how would this simpler cast fail: return (uint16_t)val;
What is wrong with this naive solution:
int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    return (uint16_t)data[0] | ((uint16_t)data[1] << 8);
}
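One of the answers takes a branch-based approach. Its exact code is not shown here, so what follows is only a minimal sketch of the idea it describes: build the unsigned 16-bit value, test the sign bit, and for negative inputs invert, negate and subtract 1.

#include <stdint.h>

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint16_t val = (uint16_t)((unsigned)data[0] | ((unsigned)data[1] << 8));

    if (val & 0x8000)                      /* sign bit set: negative in the 2's complement input */
        return -(int)(uint16_t)~val - 1;   /* ~val masked to 16 bits has its top bit clear, so it fits in int */
    else
        return (int16_t)val;               /* non-negative values fit directly */
}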
This should be pedantically correct and work also on platforms that use sign bit or 1's complement representations instead of the usual 2's complement. The input bytes are assumed to be in 2's complement.

Because of the branch, it will be more expensive than other options.

What this accomplishes is that it avoids any assumption on how the int representation relates to the unsigned representation on the platform. The cast to int is required to preserve the arithmetic value for any number that will fit in the target type. Because the inversion ensures the top bit of the 16-bit number will be zero, the value will fit. Then the unary minus and the subtraction of 1 apply the usual rule for 2's complement negation. Depending on the platform, INT16_MIN could still overflow if it doesn't fit in the int type on the target, in which case long should be used.

The difference to the original version in the question comes at return time. While the original just always subtracted 0x10000 and 2's complement let signed overflow wrap it into the int16_t range, this version has the explicit if that avoids signed wrap-around (which is undefined).

Now in practice, almost all platforms in use today use 2's complement representation. In fact, if the platform has a standard-compliant stdint.h that defines int32_t, it must use 2's complement for it. Where this approach sometimes comes in handy is with some scripting languages that don't have integer data types at all: you can modify the operations shown above for floats and it will give the correct result.

Another method - using a union:
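A sketch of the union idea, with first_byte and second_byte as the index names referred to below; the values chosen here are assumptions, and as written the function assumes a little-endian host:

#include <stdint.h>

/* Index names assumed for this sketch; swapping the two values changes
   which input byte lands in the low-order byte of the union. */
#define first_byte  0
#define second_byte 1

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    union {
        uint8_t bytes[2];
        int16_t value;
    } u;
    u.bytes[first_byte]  = data[0];
    u.bytes[second_byte] = data[1];
    /* Read the two stored bytes back as the host's int16_t representation. */
    return u.value;
}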
In the program, first_byte and second_byte can be swapped according to the little or big endian model. This method is not better, but it is one of the alternatives.

Here is another version that relies only on portable and well-defined behaviours (the header #include <endian.h> is not standard, the code is):
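The code of that version is not shown here; one way such a helper might be written, assuming the non-standard le16toh()/be16toh() macros that glibc's <endian.h> provides, is:

#include <endian.h>   /* non-standard header: le16toh()/be16toh() */
#include <stdint.h>
#include <string.h>

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint16_t wire;
    memcpy(&wire, data, sizeof wire);        /* copy the two input bytes */
    uint16_t host = le16toh(wire);           /* swap to host byte order if needed */
    int16_t result;
    memcpy(&result, &host, sizeof result);   /* reinterpret the bits as int16_t */
    return result;
}

/* The big-endian variant is the same with be16toh() in place of le16toh(). */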
The little-endian version compiles to a single movbe instruction with clang; the gcc version is less optimal, see the assembly.

The arithmetic operators shift and bitwise-or in the expression (uint16_t)data[0] | ((uint16_t)data[1] << 8) don't work on types smaller than int, so those uint16_t values get promoted to int (or to unsigned if sizeof(uint16_t) == sizeof(int)). Still, that should yield the correct answer, since only the lower 2 bytes contain the value.
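To make the promotion concrete, here is a small illustration with the implicit conversions spelled out, assuming int is wider than 16 bits (as on typical platforms):

#include <stdint.h>

int16_t le16_to_cpu_signed_naive(const uint8_t data[static 2]) {
    /* Each uint16_t operand is promoted to int before << and |, so the
       expression is evaluated as a non-negative int in 0..0xFFFF. */
    int promoted = (int)(uint16_t)data[0] | ((int)(uint16_t)data[1] << 8);
    /* Converting that int back to int16_t is the implementation-defined
       step for values above 0x7FFF. */
    return (int16_t)promoted;
}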
Another pedantically correct version for big-endian to little-endian conversion (assuming a little-endian CPU) is:
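A sketch consistent with that description, assuming a little-endian CPU as stated: store the big-endian input bytes swapped into the host's byte order, then memcpy that representation into an int16_t.

#include <stdint.h>
#include <string.h>

int16_t be16_to_cpu_signed(const uint8_t data[static 2]) {
    /* On a little-endian CPU the int16_t representation stores the low
       byte first, so the big-endian input bytes are written swapped. */
    const uint8_t host_repr[2] = { data[1], data[0] };
    int16_t result;
    memcpy(&result, host_repr, sizeof result);   /* copy the representation */
    return result;
}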
memcpy is used to copy the representation of int16_t, and that is the standard-compliant way to do so. This version also compiles into a single movbe instruction, see the assembly.

I want to thank all contributors for their answers. Here is what the collective work boils down to:

- uint8_t, int16_t and uint16_t must use two's complement representation without any padding bits, so the actual bits of the representation are unambiguously those of the 2 bytes in the array, in the order specified by the function names.
- (unsigned)data[0] | ((unsigned)data[1] << 8) (for the little-endian version) compiles to a single instruction and yields an unsigned 16-bit value.
- Converting uint16_t to the signed type int16_t has implementation-defined behavior if the value is not in the range of the destination type. No special provision is made for types whose representation is precisely defined.
- The value can be compared against INT_MAX and, only when it is larger, converted to the corresponding signed value by subtracting 0x10000. Doing this for all values as suggested by zwol may produce values outside the range of int16_t, with the same implementation-defined behavior.
- Testing the 0x8000 bit explicitly causes the compilers to produce inefficient code.
- Copying the representation with memcpy is the standard-compliant way to reinterpret the bits of the unsigned value as int16_t.

Combining points 2 and 7 (the shift-or expression and the memcpy), here is a portable and fully defined solution that compiles efficiently to a single instruction with both gcc and clang:
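The code of that solution is not reproduced above; a sketch of the combination just described, forming a uint16_t with the shift-or expression and then copying its representation into an int16_t with memcpy, would be:

#include <stdint.h>
#include <string.h>

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint16_t val = (uint16_t)((unsigned)data[0] | ((unsigned)data[1] << 8));
    int16_t result;
    /* uint16_t and int16_t are two's complement without padding bits, so
       copying the object representation is fully defined. */
    memcpy(&result, &val, sizeof result);
    return result;
}

int16_t be16_to_cpu_signed(const uint8_t data[static 2]) {
    uint16_t val = (uint16_t)((unsigned)data[1] | ((unsigned)data[0] << 8));
    int16_t result;
    memcpy(&result, &val, sizeof result);
    return result;
}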
64-bit Assembly:
If int is 16-bit, then your version relies on implementation-defined behaviour if the value of the expression in the return statement is out of range for int16_t.

However, the first version also has a similar problem; for example, if int32_t is a typedef for int, and the input bytes are both 0xFF, then the result of the subtraction in the return statement is UINT_MAX, which causes implementation-defined behaviour when converted to int16_t.
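A small demonstration of that 0xFF, 0xFF case against the helper quoted in the question, assuming a 32-bit int (so that int32_t is int and unsigned int is 32 bits wide):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    const uint8_t data[2] = { 0xFF, 0xFF };
    uint32_t val = ((uint32_t)data[0] << 8) | (uint32_t)data[1];   /* 0xFFFF */

    /* Usual arithmetic conversions: the int32_t operand is converted to
       unsigned int, and 65535u - 65536u wraps around to UINT_MAX. */
    printf("%u\n", ((int32_t)val) - 0x10000u);   /* 4294967295 with 32-bit unsigned */

    /* Converting that value to int16_t, as the return statement does, is
       implementation-defined; the intended result is -1. */
    return 0;
}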
IMHO the answer you link to has several major issues.