How to use this macro to test if memory is aligned

Posted 2019-02-13 11:27

Question:

I'm a SIMD beginner, and I've read this article about the topic (I'm using an AVX2-compatible machine).

I've also read in this question how to check whether a pointer is aligned.

I'm testing it with this toy example main.cpp:

#include <cstdint>   // uintptr_t
#include <iostream>
#include <immintrin.h>

#define is_aligned(POINTER, BYTE_COUNT) \
    (((uintptr_t)(const void *)(POINTER)) % (BYTE_COUNT) == 0)


int main()
{
  float a[8];
  for(int i=0; i<8; i++){
    a[i]=i;
  }
  __m256 evens = _mm256_set_ps(2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0);
  std::cout<<is_aligned(a, 16)<<" "<<is_aligned(&evens, 16)<<std::endl;   
  std::cout<<is_aligned(a, 32)<<" "<<is_aligned(&evens, 32)<<std::endl;   

}

And compile it with icpc -std=c++11 -o main main.cpp.

The output is:

1 1
1 1

However, if I add these three lines before the four prints:

for(int i=0; i<8; i++)
  std::cout<<a[i]<<" ";
std::cout<<std::endl;

This is the result:

0 1 2 3 4 5 6 7 
1 1
0 1

In particular, I don't understand that last 0. Why is it different from the previous output? What am I missing?

Answer 1:

Your is_aligned (which is a macro, not a function) determines whether an object happens to be aligned to a particular boundary. It does not determine the alignment requirement of the object's type.

For a float array, the compiler only guarantees alignment to at least the alignment requirement of float, which is typically 4. An address that is a multiple of 4 is not necessarily a multiple of 32, so there is no guarantee that the array sits on a 32-byte boundary. Many addresses are divisible by both 4 and 32, though, so an array on a 4-byte boundary can happen to be on a 32-byte boundary as well. That is what happened in your first test, but nothing guaranteed it. In your second test, the added code changed the layout of the local variables, and the array ended up at a different address, one that happened not to be on a 32-byte boundary.
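The difference between what a type requires and where an object happens to land can be checked directly. The following is only a sketch: is_aligned_to is a hypothetical function version of the macro from the question, and count_32_byte_aligned_slots is an illustrative name, not something from the original code. It assumes alignof(float) is 4, as on typical platforms.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: function form of the is_aligned macro from the question.
inline bool is_aligned_to(const void* p, std::size_t byte_count) {
    return reinterpret_cast<std::uintptr_t>(p) % byte_count == 0;
}

// What the type requires (typically 4), as opposed to where an object ends up.
constexpr std::size_t float_requirement = alignof(float);

// Walk every address inside one 32-byte-aligned block that satisfies the
// float requirement and count how many also sit on a 32-byte boundary.
// Assumption: alignof(float) == 4, giving offsets 0, 4, ..., 28.
inline int count_32_byte_aligned_slots() {
    alignas(32) unsigned char buffer[32];
    int count = 0;
    for (std::size_t offset = 0; offset < 32; offset += float_requirement) {
        if (is_aligned_to(buffer + offset, 32)) {
            ++count;
        }
    }
    return count;
}
```

Only the slot at offset 0 passes the 32-byte check, while all of them are valid places for a float. A plain float a[8] may start at any of these slots, which is why the result of is_aligned(a, 32) can change when unrelated code is added.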

To request a stricter alignment that may be required by SIMD instructions, you can use the alignas specifier:

alignas(32) float a[8];
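As a rough illustration of the fix, the sketch below applies alignas in two ways: on a single array, as in the line above, and on a wrapper type. The names is_on_boundary, Vec8 and aligned_array_is_32_byte_aligned are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper with the same logic as the is_aligned macro.
inline bool is_on_boundary(const void* p, std::size_t byte_count) {
    return reinterpret_cast<std::uintptr_t>(p) % byte_count == 0;
}

// Hypothetical wrapper type: alignas on the struct raises the alignment
// requirement of the type itself, so objects with automatic or static
// storage are placed on a 32-byte boundary.
struct alignas(32) Vec8 {
    float v[8];
};
static_assert(alignof(Vec8) == 32, "Vec8 must require 32-byte alignment");

// alignas on one variable: only this array gets the stricter alignment.
inline bool aligned_array_is_32_byte_aligned() {
    alignas(32) float a[8];
    for (int i = 0; i < 8; ++i) {
        a[i] = static_cast<float>(i);
    }
    return is_on_boundary(a, 32);
}
```

With the alignment guaranteed, the 32-byte check no longer depends on how the surrounding code changes the stack layout. This matters for aligned loads such as _mm256_load_ps, which require a 32-byte-aligned address; _mm256_loadu_ps is the variant for addresses that may be unaligned.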