I was wondering about a possible way to make memory layout of a class to be more effective in templated code. As far as I know, Standard mandates data members of a class to be laid out in memory on order of their declaration. There might be possible padding done by the compiler to align the data members adding unnecessarily to the size of the class. The idea is to re-arrange data members declarations at compile time to minimize such padding. I did some searching, but couldn't find any info (most of the time people discuss packing compiler directives, which is not quite the same as I see it).
First, please consider the following (trivial, but repetitive and ugly) code (same code on ideone.com) (questions are below the code, feel free to skip right down to them):
#include <iostream>
#include <cstdint>
namespace so
{
template <typename Ta, typename Tb, typename Tc, std::size_t =
((sizeof(Ta) >= sizeof(Tb)) && (sizeof(Tb) >= sizeof(Tc))) ? 10 :
((sizeof(Ta) >= sizeof(Tc)) && (sizeof(Tc) >= sizeof(Tb))) ? 11 :
((sizeof(Tb) >= sizeof(Ta)) && (sizeof(Ta) >= sizeof(Tc))) ? 20 :
((sizeof(Tb) >= sizeof(Tc)) && (sizeof(Tc) >= sizeof(Ta))) ? 21 :
((sizeof(Tc) >= sizeof(Ta)) && (sizeof(Ta) >= sizeof(Tb))) ? 30 :
((sizeof(Tc) >= sizeof(Tb)) && (sizeof(Tb) >= sizeof(Ta))) ? 31 : 0>
struct foo {};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 10>
{
Ta a;
Tb b;
Tc c;
foo(Ta _a, Tb _b, Tc _c) : a{_a}, b{_b}, c{_c} {}
};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 11>
{
Ta a;
Tc c;
Tb b;
foo(Ta _a, Tb _b, Tc _c) : a{_a}, c{_c}, b{_b} {}
};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 20>
{
Tb b;
Ta a;
Tc c;
foo(Ta _a, Tb _b, Tc _c) : b{_b}, a{_a}, c{_c} {}
};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 21>
{
Tb b;
Tc c;
Ta a;
foo(Ta _a, Tb _b, Tc _c) : b{_b}, c{_c}, a{_a} {}
};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 30>
{
Tc c;
Ta a;
Tb b;
foo(Ta _a, Tb _b, Tc _c) : c{_c}, a{_a}, b{_b} {}
};
template <typename Ta, typename Tb, typename Tc>
struct foo<Ta, Tb, Tc, 31>
{
Tc c;
Tb b;
Ta a;
foo(Ta _a, Tb _b, Tc _c) : c{_c}, b{_b}, a{_a} {}
};
template <typename Ta, typename Tb, typename Tc>
struct bar: public foo<Ta, Tb, Tc>
{
private:
using base = foo<Ta, Tb, Tc>;
public:
bar() = default;
using base::base;
};
template <typename Ta, typename Tb, typename Tc>
struct foobar
{
Ta a;
Tb b;
Tc c;
foobar() = default;
foobar(Ta _a, Tb _b, Tc _c) : a{_a}, b{_b}, c{_c} {}
};
} //namespace so
int main()
{
so::bar<std::uint16_t, std::uint32_t, std::uint16_t> bar{1, 2, 3};
so::foobar<std::uint16_t, std::uint32_t, std::uint16_t> foobar{1, 2, 3};
std::cout << sizeof(bar) << "\t" << sizeof(foobar) << std::endl;
std::cout << bar.a << " : " << bar.b << " : " << bar.c << std::endl;
std::cout << foobar.a << " : " << foobar.b << " : " << foobar.c << std::endl;
return (0);
}
Program output:
8 12
1 : 2 : 3
1 : 2 : 3
Questions:
- Is there some well-known, compiler-independent way of solving such thing (Boost, maybe)?
- If no, is there some compiler-specific directives that will do such thing automatically (without data misalignment as with GCC's
__atribute__((packed))
)? - Can this be done in a more generic way (possibly using variadic templates)?
Thanks in advance!
Take a look at this series of blog posts by R. Martinho Fernandes : http://flamingdangerzone.com/cxx11/2012/07/06/optimal-tuple-i.html.
It outlines optimal packing of tuples. You can use such a "packed" tuple as the data storage for you class, and provide accessors hiding the
get<0>()
style tuple element access.I believe I have a relatively simple variadic template solution.
The implementation will require a couple helpers though, so I will present it backward so you can get the gist of it first.
The main benefit here is that changing how to pack elements will only affect how
Indexed
is defined, so you can easily improve on the algorithm. For now, I will just provide an equivalent of your template.Disclaimer: the following code is untested, it may not even compile, let alone produce the correct result.
I. Generating indices.
The explanation can be found on the lounge, we can reuse it to generate a pack of pairs (type, index). To sort it we are going to use a MPL algorithm, so it is simpler to produce the pack as a MPL vector.
II. Sorting
In order to sort we are going to use the MPL sort algorithm, which operates on types.
III. Computing the Storage class
Now, we only need to compute the storage class which is done by extracting the first element of each pair. Interestingly, pattern matching can really assist here.
IV. Computing the Index solver
V. Putting it all together
And now we wire up everything:
Hint: I advise using a specialized namespace to hold all the helpers. Whilst they could be defined within the template, it is easier to define them outside since they do not depend on
Args...
, however you would need to isolate them to avoid clashes with other parts of your program.