If I want to process data in a std::vector with SSE, I need 16-byte alignment. How can I achieve that? Do I need to write my own allocator? Or does the default allocator already align to 16-byte boundaries?
You should use a custom allocator with std:: containers, such as vector. Can't remember who wrote the following one, but I used it for some time and it seems to work (you might have to change _aligned_malloc to _mm_malloc, depending on compiler/platform):

Use it like this (change the 16 to another alignment, if needed):
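A minimal sketch of such an allocator and its use, assuming C++11 or later. It is not the allocator the answer refers to: the names AlignedAllocator, is_aligned16 and make_aligned_floats are hypothetical, and it over-allocates with std::malloc instead of calling _aligned_malloc or _mm_malloc so that it does not depend on a particular compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator: returns storage aligned to Alignment bytes.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
    static_assert(Alignment >= alignof(void*) && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two, at least alignof(void*)");

    using value_type = T;

    // Explicit rebind: the default one cannot handle the non-type parameter.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() noexcept {}
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        const std::size_t extra = Alignment + sizeof(void*);
        if (n > (static_cast<std::size_t>(-1) - extra) / sizeof(T)) throw std::bad_alloc();
        // Over-allocate: room to round up, plus a slot to remember the raw pointer.
        void* raw = std::malloc(n * sizeof(T) + extra);
        if (raw == nullptr) throw std::bad_alloc();
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        addr = (addr + Alignment - 1) & ~static_cast<std::uintptr_t>(Alignment - 1);
        reinterpret_cast<void**>(addr)[-1] = raw;  // stored just before the aligned block
        return reinterpret_cast<T*>(addr);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // Recover the pointer malloc returned and free that one.
        if (p != nullptr) std::free(reinterpret_cast<void**>(p)[-1]);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return true; }
template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&, const AlignedAllocator<U, A>&) noexcept { return false; }

// Hypothetical helper: true if p sits on a 16-byte boundary.
inline bool is_aligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Usage: change the 16 to another alignment if needed.
inline std::vector<float, AlignedAllocator<float, 16>> make_aligned_floats(std::size_t n) {
    std::vector<float, AlignedAllocator<float, 16>> v(n, 0.0f);
    assert(is_aligned16(v.data()));
    return v;
}
```

With float elements only every fourth element lands on a 16-byte boundary, so &v[4] is aligned while &v[1] is not.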
This, however, only makes sure the memory block std::vector uses is 16-byte aligned. If sizeof(T) is not a multiple of 16, some of your elements will not be aligned. Depending on your data type, this might be a non-issue. If T is int (4 bytes), only load elements whose index is a multiple of 4. If it's double (8 bytes), only multiples of 2, etc.

The real issue is if you use classes as T, in which case you will have to specify your alignment requirements in the class itself (again, depending on compiler, this might be different; the example is for GCC):

We're almost done! If you use Visual C++ (at least, version 2010), you won't be able to use an std::vector with classes whose alignment you specified, because of std::vector::resize.

When compiling, if you get the following error:
You will have to hack your std::vector header file: in the vector header file [C:\Program Files\Microsoft Visual Studio 10.0\VC\include\vector], locate the void resize( _Ty _Val ) method [line 870 on VC2010] and change its signature to void resize( const _Ty& _Val ).

Use __declspec(align(x,y)) as explained in the Intel vectorization tutorial: http://d3f8ykwhia686p.cloudfront.net/1live/intel/CompilerAutovectorizationGuide.pdf

Write your own allocator.
allocate and deallocate are the important ones. Here is one example:

Short Answer:
If sizeof(T)*vector.size() > 16 then yes, assuming your vector uses normal allocators.

Caveat: as long as alignof(std::max_align_t) >= 16, as this is the max alignment.

Long Answer:
Updated 25/Aug/2017 for the new standard draft, n4659.
If it is aligned for anything that is greater than 16 it is also aligned correctly for 16.
6.11 Alignment (Paragraph 4/5)
new and new[] return values that are aligned so that objects are correctly aligned for their size:
8.3.4 New (paragraph 17)
Note most systems have a maximum alignment. Dynamically allocated memory does not need to be aligned to a value greater than this.
6.11 Alignment (paragraph 2)
Thus, as long as the memory block your vector allocates is larger than 16 bytes, it will be correctly aligned on 16-byte boundaries.
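A quick way to check this reasoning on a given toolchain is to compare alignof(std::max_align_t) with 16 and to inspect the pointer a default-allocated vector hands back. This is only a sketch: the helper names are hypothetical, and a passing check on one platform says nothing about others.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if p sits on a boundary of the given size.
inline bool is_aligned_to(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// The caveat from the short answer, evaluated at compile time.
constexpr bool max_align_covers_16 = alignof(std::max_align_t) >= 16;

// Allocates 64 floats (256 bytes, well above 16) with the default
// allocator and reports whether the block starts on a 16-byte boundary.
inline bool default_vector_is_16_aligned() {
    std::vector<float> v(64, 1.0f);
    return is_aligned_to(v.data(), 16);
}
```

If max_align_covers_16 is false on your platform, the argument above does not apply and a custom allocator is the safer route.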
Don't assume anything about STL containers. Their interface/behaviour is defined, but not what's behind them. If you need raw access, you'll have to write your own implementation that follows the rules you'd like to have.
The Standard mandates that new and new[] return data aligned for any data type, which should include SSE. Whether or not MSVC actually follows that rule is another question.
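One way to test that claim on your own compiler is to declare the alignment on the element type with alignas (C++11) and check every element of a plain std::vector. The sketch below assumes a hypothetical Float4 type. From C++17 onward the default allocator is expected to honor such alignment requirements; on older compilers the check may only pass because the platform's malloc happens to return 16-byte aligned blocks.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SSE-sized element: four floats, 16-byte aligned.
struct alignas(16) Float4 {
    float v[4];
};

static_assert(alignof(Float4) == 16, "Float4 should require 16-byte alignment");
static_assert(sizeof(Float4) % 16 == 0, "sizeof must be a multiple of the alignment");

// Walks the vector and verifies each element's address against alignof(Float4).
inline bool all_elements_aligned(const std::vector<Float4>& xs) {
    for (const Float4& x : xs) {
        if (reinterpret_cast<std::uintptr_t>(&x) % alignof(Float4) != 0) {
            return false;
        }
    }
    return true;
}
```

Because sizeof(Float4) is a multiple of 16, an aligned first element implies that all following elements are aligned as well.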