How can I convert an integer to a half-precision float (which is to be stored in an array unsigned char[2])? The input int will range from 1 to 65535. Precision is really not a concern.
I am doing something similar to convert a 16-bit int into an unsigned char[2], but I understand there is no half-precision float C++ datatype. Example of this below:
int16_t position16int = (int16_t)data;
memcpy(&dataArray, &position16int, 2);
It's very straightforward; all the info you need is on Wikipedia.
Sample implementation:
Output (ideone):
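As a minimal sketch of the direct conversion (not this answer's original code), assuming the IEEE 754 binary16 layout described on Wikipedia: 1 sign bit, 5 exponent bits with a bias of 15, and 10 stored mantissa bits. The function name uint_to_half_bits is hypothetical, and the sketch truncates rather than rounds, which seems acceptable given that precision is not a concern:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts an integer in 1..65535 to the bit pattern
// of an IEEE 754 half-precision float. Extra low bits are truncated
// (rounded toward zero), so 65535 becomes 65504, the largest finite half.
std::uint16_t uint_to_half_bits(std::uint32_t value) {
    assert(value >= 1 && value <= 65535);  // range stated in the question

    // Position of the highest set bit, i.e. floor(log2(value)).
    int exponent = 0;
    while ((value >> (exponent + 1)) != 0) {
        ++exponent;
    }

    // Align the leading 1 to bit 10, then drop it (it is implicit in the format).
    std::uint32_t mantissa = (exponent > 10) ? (value >> (exponent - 10))
                                             : (value << (10 - exponent));
    mantissa &= 0x3FFu;

    // Biased exponent goes in bits 10..14; the sign bit stays 0.
    return static_cast<std::uint16_t>(
        (static_cast<std::uint32_t>(exponent + 15) << 10) | mantissa);
}
```

For example, 1 maps to 0x3C00 (1.0 in half precision) and 65535 maps to 0x7BFF.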
I asked a question about how to convert 32-bit floats to 16-bit floats:
Float32 to Float16
So you could very easily convert the int to a float and then use the answers to the question above to create a 16-bit float. I would suggest this is probably much easier than going from int directly to 16-bit float. Effectively, by converting to a 32-bit float you have done most of the hard work, and then you just need to shift a few bits around.
Edit: Looking at Alexey's excellent answer, I think it's highly likely that using a hardware int-to-float conversion and then shifting the bits around will be a fair bit faster than his method. It might be worth profiling both methods and comparing them.
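A rough sketch of that route, assuming float is an IEEE 754 32-bit type: let the compiler do the int-to-float conversion, then rebias the exponent and keep the top 10 mantissa bits. The name int_to_half_via_float is hypothetical, the mantissa is truncated rather than rounded, and values too small for a normal half are flushed to zero (which cannot happen for nonzero integers):

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Hypothetical helper: int -> float32 -> half-precision bit pattern.
std::uint16_t int_to_half_via_float(int value) {
    float f = static_cast<float>(value);  // the hardware does the hard part

    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // read the float's bit pattern

    std::uint32_t sign = (bits >> 16) & 0x8000u;   // move sign to bit 15
    std::int32_t exponent =
        static_cast<std::int32_t>((bits >> 23) & 0xFFu) - 127 + 15;  // rebias 127 -> 15
    std::uint32_t mantissa = (bits >> 13) & 0x3FFu;  // keep top 10 of 23 bits

    if (exponent <= 0) {
        return static_cast<std::uint16_t>(sign);            // zero or too small
    }
    if (exponent >= 31) {
        return static_cast<std::uint16_t>(sign | 0x7C00u);  // too large: infinity
    }
    return static_cast<std::uint16_t>(
        sign | (static_cast<std::uint32_t>(exponent) << 10) | mantissa);
}
```

Whether this is actually faster than a direct integer conversion would need to be measured, as suggested above.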
If you are targeting supported hardware, you can use intrinsics:
https://software.intel.com/en-us/articles/performance-benefits-of-half-precision-floats
https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-7679FF37-257B-4F90-8668-5B3AA62587AD.htm
Following @kbok's comment on the question, I used the first part of this answer to get the half float and then to get the array:
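A minimal sketch of that last step, assuming the half float is already held as a 16-bit pattern in a uint16_t. Both function names are hypothetical. The first mirrors the memcpy from the question and so uses the machine's native byte order; the second fixes the order to little-endian, which may be preferable if the bytes leave the machine:

```cpp
#include <cstdint>
#include <cstring>

// Copies the two bytes in native byte order, like the question's memcpy.
void store_half_native(std::uint16_t half_bits, unsigned char out[2]) {
    std::memcpy(out, &half_bits, 2);
}

// Writes the low byte first, regardless of the host's endianness.
void store_half_le(std::uint16_t half_bits, unsigned char out[2]) {
    out[0] = static_cast<unsigned char>(half_bits & 0xFFu);
    out[1] = static_cast<unsigned char>(half_bits >> 8);
}
```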