I need to shuffle a 16 bit unsigned integer in a way that the even indexes land in the lower byte, and the odd indexes land in the upper byte.
input:
fedcba98 76543210 (contiguously numbered)
output:
fdb97531 eca86420 (even and odd separated)
My code looks like this at the moment:
typedef unsigned short u16;
u16 segregate(u16 x)
{
u16 g = (x & 0x0001);
u16 h = (x & 0x0004) >> 1;
u16 i = (x & 0x0010) >> 2;
u16 j = (x & 0x0040) >> 3;
u16 k = (x & 0x0100) >> 4;
u16 l = (x & 0x0400) >> 5;
u16 m = (x & 0x1000) >> 6;
u16 n = (x & 0x4000) >> 7;
u16 o = (x & 0x0002) << 7;
u16 p = (x & 0x0008) << 6;
u16 q = (x & 0x0020) << 5;
u16 r = (x & 0x0080) << 4;
u16 s = (x & 0x0200) << 3;
u16 t = (x & 0x0800) << 2;
u16 u = (x & 0x2000) << 1;
u16 v = (x & 0x8000);
return g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v;
}
I wonder if there is a more elegant solution than simply extracting and shifting each individual bit?
Ah yes, lookup tables to the rescue :) You can even do it with a single table and one extra shift:
The table approach shown by others is the most portable version and is probably quite fast.
If you want to take advantage of special instruction sets there are some other options as well. For Intel Haswell and later for example the following approach can be used (requires the BMI2 instruction set extension):
your answer to the even and odd bits shuffle for 64 bits is not accurate. To extend the 16 bit solution to 64 bit solution, we need not only to extend the masks, but also cover the swapping interval from 1 all the way to 16:
Tables. But generate them at compile time!
live example
In favour of being short:
You could use a 256-byte table for each byte of your 16-bit number, crafted so that your even/odd condition is satisfied. Hand-code the table entries (or use the algorithm you already have) to create the tables, and then the shuffling will be done at compile time. That would essentially be a translation table concept.