I'm using VS2005 (at work) and need an SSE intrinsic that does the following:
I have a pre-existing __m128i n filled with 16-bit integers a_1, a_2, ..., a_8.
Since some calculations that I now want to do require 32 bits instead of 16, I want to extract the two sets of four 16-bit integers from n and put them into two separate __m128i values, which contain a_1, ..., a_4 and a_5, ..., a_8 respectively.
I could do this manually using the various _mm_set intrinsics, but those would result in eight movs in assembly, and I'd hoped that there would be a faster way to do this.
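To make the manual version concrete, here is a rough sketch of what I mean. The helper name widen_manual is made up for illustration, and it rests on two assumptions that are not stated above: the 16-bit values are signed, and a_1 sits in the lowest lane of n.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_set_epi32, ... */

/* Hypothetical helper (not from any library): the slow "manual" approach.
   Assumption: the 16-bit elements are signed and a_1 is lane 0 of n. */
static void widen_manual(__m128i n, __m128i *lo, __m128i *hi)
{
    short a[8];

    /* Spill the eight 16-bit lanes to memory so they can be read as scalars. */
    _mm_storeu_si128((__m128i *)a, n);

    /* _mm_set_epi32 takes its arguments from the highest lane down to the
       lowest, so the last argument ends up in lane 0. Each short is
       sign-extended to int by the usual C conversion. */
    *lo = _mm_set_epi32(a[3], a[2], a[1], a[0]);  /* a_1 .. a_4 */
    *hi = _mm_set_epi32(a[7], a[6], a[5], a[4]);  /* a_5 .. a_8 */
}
```

Called as widen_manual(n, &lo, &hi), this gives the result I want, but it goes through memory and rebuilds both vectors one scalar at a time, which is exactly the cost I am trying to avoid.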