"Expand" font size in SSE register
I am using VS2005 (at work) and need an SSE intrinsic that does the following:
I have an existing __m128i n filled with eight 16-bit integers a_1, a_2, ..., a_8.
Since some of the computation I now want to do requires 32 bits instead of 16, I want to extract the two sets of four 16-bit integers from n and put them into two separate __m128i values that contain a_1, ..., a_4 and a_5, ..., a_8 respectively.
I could do this manually using the various _mm_set intrinsics, but that would result in eight mov instructions in the generated assembly, and I was hoping there would be a faster way to do it.
Assuming I understand correctly what you want to achieve (unpack 8 x 16-bit ints in one vector into two vectors of 4 x 32-bit ints), I usually do it like this in SSE2 and later:
__m128i v = _mm_set_epi16(7, 6, 5, 4, 3, 2, 1, 0); // v = { 7, 6, 5, 4, 3, 2, 1, 0 }
__m128i v_lo = _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16); // v_lo = { 3, 2, 1, 0 }
__m128i v_hi = _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16); // v_hi = { 7, 6, 5, 4 }
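To see why this works: interleaving v with itself puts a copy of each 16-bit value in both halves of a 32-bit lane, and the arithmetic right shift by 16 then keeps the upper copy while filling the top bits with its sign. Here is a minimal sketch of the same idea wrapped in helpers. The names widen_lo_epi16, widen_hi_epi16 and widen8 are made up for illustration, and the code assumes short is 16 bits and int is 32 bits, as on MSVC and other mainstream x86 compilers.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: sign-extend the four low 16-bit lanes of v to 32 bits. */
static __m128i widen_lo_epi16(__m128i v)
{
    /* unpacklo(v, v) duplicates each 16-bit value into both halves of a
       32-bit lane; the arithmetic shift keeps the upper copy and
       replicates its sign bit into the top 16 bits. */
    return _mm_srai_epi32(_mm_unpacklo_epi16(v, v), 16);
}

/* Hypothetical helper: the same for the four high 16-bit lanes. */
static __m128i widen_hi_epi16(__m128i v)
{
    return _mm_srai_epi32(_mm_unpackhi_epi16(v, v), 16);
}

/* Hypothetical helper: widen eight 16-bit values from memory into eight
   32-bit values. Unaligned loads and stores are used so the arrays need
   no special alignment. */
static void widen8(const short in[8], int out[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)(out + 0), widen_lo_epi16(v));
    _mm_storeu_si128((__m128i *)(out + 4), widen_hi_epi16(v));
}
```

Calling widen8 on an array that contains negative values such as -1 or -32768 should give back the same values as 32-bit ints, which confirms that the sign is preserved. If the 16-bit values are unsigned, interleaving with _mm_setzero_si128() instead of v, for example _mm_unpacklo_epi16(v, _mm_setzero_si128()), zero-extends them and needs no shift.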