“Extend” data type size in SSE register

Posted 2019-05-11 02:28

Question:

I'm using VS2005 (at work) and need an SSE intrinsic that does the following:

I have a pre-existing __m128i n filled with 16-bit integers a_1, a_2, ..., a_8.

Since some calculations that I now want to do require 32 instead of 16 bits, I want to extract the two groups of four 16-bit integers from n and put them into two separate __m128i values, which contain a_1, ..., a_4 and a_5, ..., a_8 respectively, each widened to 32 bits.

I could do this manually using the various _mm_set intrinsics, but those would result in eight movs in assembly, and I'd hoped that there would be a faster way to do this.

Answer 1:

Assuming that I understand correctly what it is that you want to achieve (unpack 8 x 16-bit ints in one vector into two vectors of 4 x 32-bit ints), I typically do it like this in SSE2 and later:

__m128i v = _mm_set_epi16(7, 6, 5, 4, 3, 2, 1, 0);  // v = { 7, 6, 5, 4, 3, 2, 1, 0 } (highest element listed first)
__m128i v_lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16); // v_lo = { 3, 2, 1, 0 }, sign-extended to 32 bits
__m128i v_hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16); // v_hi = { 7, 6, 5, 4 }, sign-extended to 32 bits
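
To make the trick easier to check, here is a minimal sketch that wraps it in two array-based helpers. The helper names (widen_s16_to_s32, widen_u16_to_u32) and the array interface are my own assumptions for illustration and are not part of the answer above. Unpacking v with itself copies each 16-bit element into both halves of a 32-bit lane, and the arithmetic right shift by 16 then leaves the original value sign-extended. For unsigned data, interleaving with zero should be enough, and no shift is needed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper (not from the original answer): sign-extend eight
   16-bit integers into two groups of four 32-bit integers. */
static void widen_s16_to_s32(const short in[8], int out_lo[4], int out_hi[4])
{
    /* Unaligned load/store so the caller's arrays need no special alignment. */
    __m128i v  = _mm_loadu_si128((const __m128i *)in);
    /* unpack(v, v) puts each 16-bit element in both halves of a 32-bit lane;
       the arithmetic shift keeps the upper copy and fills with the sign bit. */
    __m128i lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16);
    __m128i hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16);
    _mm_storeu_si128((__m128i *)out_lo, lo);
    _mm_storeu_si128((__m128i *)out_hi, hi);
}

/* Hypothetical unsigned variant: zero-extend by interleaving with zero,
   so the upper 16 bits of every 32-bit lane are 0. */
static void widen_u16_to_u32(const unsigned short in[8],
                             unsigned int out_lo[4], unsigned int out_hi[4])
{
    __m128i zero = _mm_setzero_si128();
    __m128i v    = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out_lo, _mm_unpacklo_epi16(v, zero));
    _mm_storeu_si128((__m128i *)out_hi, _mm_unpackhi_epi16(v, zero));
}
```

Calling widen_s16_to_s32 on an input that contains negative values is a quick way to confirm that the sign survives the widening; in a hot loop you would normally keep everything in __m128i registers and skip the loads and stores.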


Tags: c sse simd