Is there a way to de-interleave the channels of a 32bpp image with SSE, similar to the NEON code below?
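// Read 8 RGBA pixels; vld4_u8 de-interleaves them into 4 registers (R, G, B, A)
uint8x8x4_t SrcPixels8x8x4 = vld4_u8(inPixel32);
// Widen the red channel from 8 to 32 bits: pixels 0..3, then pixels 4..7
uint32x4_t channelR1_32x4 = vmovl_u16(vget_low_u16(vmovl_u8(SrcPixels8x8x4.val[0])));
uint32x4_t channelR2_32x4 = vmovl_u16(vget_high_u16(vmovl_u8(SrcPixels8x8x4.val[0])));
// (the widened halves are later used together with vectors such as vGaussElement_32x4_high)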
Basically, I want each color channel in its own vectors, with every vector holding four 32-bit elements, so that I can do some calculations on them. I am not very familiar with SSE and could not find an equivalent instruction. Is there one, or can someone suggest a better way to do this? Any help is highly appreciated.
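To make the goal concrete, here is a plain scalar C sketch of the result I am after for one group of 8 pixels. It is only a reference for the expected data layout, not SSE code. The names Channel32x8 and deinterleave_rgba8 are made up for this illustration, and I am assuming the bytes are stored in R, G, B, A order.

```c
#include <stdint.h>

/* Hypothetical container for one channel of 8 pixels, widened to 32 bits:
   lo holds pixels 0..3 and hi holds pixels 4..7, i.e. two 4x32-bit vectors. */
typedef struct {
    uint32_t lo[4];
    uint32_t hi[4];
} Channel32x8;

/* Hypothetical scalar reference: splits 8 interleaved 32bpp pixels
   (assumed byte order R, G, B, A) into out[0]=R, out[1]=G, out[2]=B, out[3]=A. */
static void deinterleave_rgba8(const uint8_t *inPixel32, Channel32x8 out[4])
{
    for (int p = 0; p < 8; ++p) {          /* pixel index within the group */
        for (int c = 0; c < 4; ++c) {      /* channel index within the pixel */
            uint32_t v = inPixel32[4 * p + c];  /* zero-extend 8 -> 32 bits */
            if (p < 4) {
                out[c].lo[p] = v;
            } else {
                out[c].hi[p - 4] = v;
            }
        }
    }
}
```

Here out[0].lo and out[0].hi correspond to channelR1_32x4 and channelR2_32x4 in the NEON snippet, and I would like the SSE version to produce the same layout for all four channels.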