I want to add the values at indices 0 and 1 of an int64x2_t vector in NEON.
I am not able to find any pairwise-add instruction that does this.
int64x2_t sum_64_2;
// The result I expect is:
//int64_t result = sum_64_2[0] + sum_64_2[1];
Is there any NEON instruction that does this?
You can write it in two ways. This one explicitly uses the NEON VADD.I64
instruction:
int64x1_t f(int64x2_t v)
{
    /* Add the high 64-bit half to the low 64-bit half. */
    return vadd_s64(vget_high_s64(v), vget_low_s64(v));
}
and the following one relies on the compiler to correctly select between using the NEON and general integer instruction sets. GCC 4.9 does the right thing in this case, but other compilers may not.
int64x1_t g(int64x2_t v)
{
    int64x1_t r;
    /* Extract both lanes, add them as scalars, and write the sum into the only lane of r. */
    r = vset_lane_s64(vgetq_lane_s64(v, 0) + vgetq_lane_s64(v, 1), r, 0);
    return r;
}
When targeting 32-bit ARM, the generated code is efficient. For AArch64, extra instructions are emitted, although the compiler could do better.
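Since the question asks for a plain int64_t rather than an int64x1_t, here is a minimal sketch of how the scalar could be obtained. The helper names sum_lanes_s64 and sum_pair_s64 are hypothetical and only for illustration. On AArch64 the sketch assumes the across-vector add intrinsic vaddvq_s64 from arm_neon.h is available; on 32-bit ARM it falls back to the vadd_s64 approach and reads lane 0. The scalar branch exists only so the sketch also compiles on hosts without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: add both 64-bit lanes and return a plain scalar. */
static inline int64_t sum_lanes_s64(int64x2_t v)
{
#if defined(__aarch64__)
    /* AArch64 only: across-vector add of the two lanes. */
    return vaddvq_s64(v);
#else
    /* 32-bit ARM: add the high and low halves, then read lane 0. */
    return vget_lane_s64(vadd_s64(vget_high_s64(v), vget_low_s64(v)), 0);
#endif
}

/* Hypothetical wrapper: load two scalars into a vector and sum its lanes. */
static inline int64_t sum_pair_s64(int64_t a, int64_t b)
{
    const int64_t lanes[2] = { a, b };
    return sum_lanes_s64(vld1q_s64(lanes));
}
#else
/* Scalar fallback (assumption: included only so non-NEON hosts can build this). */
static inline int64_t sum_pair_s64(int64_t a, int64_t b)
{
    return a + b;
}
#endif
```

With this in place, the expected result from the question would be written as int64_t result = sum_lanes_s64(sum_64_2);. Whether the AArch64 path produces better code than f() depends on the compiler version, so it is worth checking the generated assembly.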