Get member of __m128 by index?

Posted 2019-01-22 12:20

Question:

I've got some code, originally given to me by someone working with MSVC, and I'm trying to get it to work on Clang. Here's the function that I'm having trouble with:

float vectorGetByIndex( __m128 V, unsigned int i )
{
    assert( i <= 3 );
    return V.m128_f32[i];
}

The error I get is as follows:

Member reference has base type '__m128' is not a structure or union.

I've looked around and found that Clang (and maybe GCC) does not treat __m128 as a struct or union. However, I haven't managed to find a straight answer on how to get these values back out. I tried the subscript operator and that didn't work either, and I've skimmed the huge list of SSE intrinsic functions without finding an appropriate one.

Answer 1:

A union is probably the most portable way to do this:

typedef union {
    __m128 v;    // SSE 4 x float vector
    float a[4];  // scalar array of 4 floats
} U;             // typedef makes U a type name, so "U u;" below compiles

float vectorGetByIndex(__m128 V, unsigned int i)
{
    U u;

    assert(i <= 3);
    u.v = V;
    return u.a[i];
}
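
If you would rather not rely on union type punning, one alternative (a swapped-in technique, not part of the answer above) is to store the vector to a plain float array and index that. This is a minimal sketch; the name vectorGetByIndexStore is hypothetical and chosen only for illustration:

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_storeu_ps

// Hypothetical helper (name not from the original thread): copies all four
// lanes to a local float array, then reads the requested lane from it.
inline float vectorGetByIndexStore(__m128 V, unsigned int i)
{
    assert(i <= 3);
    float lanes[4];
    _mm_storeu_ps(lanes, V);  // unaligned store: no alignment needed on lanes
    return lanes[i];
}
```

With V built by _mm_setr_ps(1.0f, 2.0f, 3.0f, 4.0f), index 0 gives 1.0f and index 3 gives 4.0f, since _mm_setr_ps puts its first argument in the lowest lane.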


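Answer 2:

Even if SSE4.1 is available and i is a compile-time constant, you can't use the integer extract instructions (pextrd and friends) this way. _mm_extract_epi32 takes a __m128i and returns an int, not a float: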

// broken code starts here
template<unsigned i>
float vectorGetByIndex( __m128 V) {
    return _mm_extract_epi32(V, i);
}
// broken code ends here

I'm not deleting it, because it is a useful reminder of how not to do things, and I'll let it stand as a public humiliation.

It is better to use:

template<unsigned i>
float vectorGetByIndex( __m128 V) {
    union {
        __m128 v;    
        float a[4];  
    } converter;
    converter.v = V;
    return converter.a[i];
}

which will work regardless of the available instruction set.
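
To illustrate one reason an integer extract is the wrong tool for floats, here is a small sketch that uses only SSE2, so it builds without SSE4.1. An integer extract hands back the lane's raw bit pattern; converting that int to float numerically gives a large number, not the stored value. The helper names lane0Bits and bitsToFloat are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <emmintrin.h>  // SSE2: _mm_castps_si128, _mm_cvtsi128_si32

// Hypothetical helper: returns lane 0 of V as its raw 32-bit pattern,
// which is what an integer extract of a float lane produces.
inline std::int32_t lane0Bits(__m128 V)
{
    return _mm_cvtsi128_si32(_mm_castps_si128(V));
}

// Hypothetical helper: reinterprets a 32-bit pattern as a float by copying
// the bytes, rather than converting the integer value.
inline float bitsToFloat(std::int32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

For a vector whose lane 0 holds 1.0f, lane0Bits returns 0x3f800000. Passing that through bitsToFloat recovers 1.0f, whereas a plain int-to-float conversion yields roughly 1.07e9.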



Answer 3:

As a modification to hirschhornsalz's solution, if i is a compile-time constant, you could avoid the union path entirely by using a shuffle/store:

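template<unsigned i>
float vectorGetByIndex( __m128 V)
{
#ifdef __SSE4_1__
    // _mm_extract_ps (from <smmintrin.h>) returns the raw bits of element i
    // as an int, so copy those bits into a float instead of converting them
    int bits = _mm_extract_ps(V, i);
    float ret;
    memcpy(&ret, &bits, sizeof ret);  // needs <string.h>
    return ret;
#else
    // shuffle V so that the element that you want is moved to the least-
    // significant element of the vector (V[0])
    V = _mm_shuffle_ps(V, V, _MM_SHUFFLE(i, i, i, i));
    // return the value in V[0]
    return _mm_cvtss_f32(V);
#endif
}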
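
The question asks for a runtime index, while the shuffle needs a compile-time one. One way to bridge the two, sketched here under the assumption that plain SSE is all you have, is a switch that dispatches to the four compile-time instantiations. The names getLane and getLaneRuntime are hypothetical:

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: _mm_shuffle_ps, _mm_cvtss_f32, _MM_SHUFFLE

// Hypothetical helper: compile-time lane selection via shuffle, then read
// the lowest lane as a scalar float.
template <unsigned i>
inline float getLane(__m128 V)
{
    static_assert(i <= 3, "lane index must be 0..3");
    return _mm_cvtss_f32(_mm_shuffle_ps(V, V, _MM_SHUFFLE(i, i, i, i)));
}

// Hypothetical wrapper: maps a runtime index onto the four instantiations.
inline float getLaneRuntime(__m128 V, unsigned int i)
{
    assert(i <= 3);
    switch (i) {
        case 0:  return getLane<0>(V);
        case 1:  return getLane<1>(V);
        case 2:  return getLane<2>(V);
        default: return getLane<3>(V);
    }
}
```

Whether this beats the union or store approach depends on the compiler and the surrounding code, so it is worth measuring before committing to it.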


Answer 4:

The approach I use is:

union vec { __m128 sse; float f[4]; };

float accessmember(__m128 v, int index)
{
    vec u;       // separate name so it does not clash with the parameter v
    u.sse = v;
    return u.f[index];
}

Seems to work out pretty well for me.