I've been working with SSE for a while now, and I've seen my share of alignment issues. This, however, is beyond my understanding:
I get different alignment depending on whether I run the program with F5 (debug) or outside the debugger (Ctrl+F5)!
Some background info:
I'm using a wrapper for an SSE-enabled datatype, with overloaded operators and a custom allocator (overloaded new and delete operators using _mm_malloc and _mm_free). But in the example below, I've managed to reduce the problem even further, i.e. the issue also happens even if I don't use the custom allocator.
As you can see below, in main() I dynamically allocate a TestClass object on the heap, which contains a SSEVector type object. I'm using a dummy float[2] member variable to "misalign" the stack a bit.
I obtain the following output when I run with F5:
object address 00346678
_memberVariable1 address 00346678
_sseVector address 00346688
And if I run with Ctrl+F5:
object address 00345B70
_memberVariable1 address 00345B70
_sseVector address 00345B80
As you can see, the alignment is different (i.e. not 16-byte) when I run it in the debugger. Is it just a coincidence that the alignment is correct when using Ctrl+F5? I'm using Visual Studio 2010 with a new project (default settings).
If I declare the object on the stack, i.e. TestClass myObject;, this issue does not appear. Using __declspec(align(16)) does not help, either.
The code I used to reproduce the issue:
#include <iostream>
#include <string>
#include <xmmintrin.h> // SSE
//#include "DynAlignedAllocator.h"
//////////////////////////////////////////////////////////////
class SSEVector /*: public DynAlignedAllocator<16>*/
{
public:
    SSEVector() { }
    __m128 vec;
};

class TestClass
{
public:
    TestClass() { }
    /*__declspec(align(16))*/ float _memberVariable1 [2];
    SSEVector _sseVector;
};
//////////////////////////////////////////////////////////////
int main (void)
{
    TestClass* myObject = new TestClass;
    std::cout << "object address " << myObject << std::endl;
    std::cout << "_memberVariable1 address " << &(myObject->_memberVariable1) << std::endl;
    std::cout << "_sseVector address " << &(myObject->_sseVector) << std::endl;
    delete myObject;

    // wait for ENTER
    std::string dummy;
    std::getline(std::cin, dummy);
    return 0;
}
Any hints or comments are greatly appreciated. Thanks in advance.