How to write cache friendly polymorphic code in C+

2019-04-21 15:16发布

I am writing a piece of code with high demands on performance where I need to handle a large number of objects in a polymorphic way. Let's say I have a class A and a class B which is derived from A. I could now create a vector of B:s like this

vector<A*> a(n);
for(int i = 0; i < n; i++)
  a[i] = new B();

but if n is large (in the order 10^6 or more in my case) this would require very many calls to new and moreover the n objects could potentially be spread out all over my main memory resulting in very poor cache performance. What would be the right way to deal with this situation? I am thinking of doing something like the following to have all the objects in a contiguous memory region.

B* b = new B[n];
vector<A*> a(n);
for(int i = 0; i < n; i++)
  a[i] = b + i;

but one problem is how to free up the memory allocated by new B[n] if b is not available anymore (but we still have a). I have just learnt that trying

delete[] a[0];

is not a good idea...

4条回答
男人必须洒脱
2楼-- · 2019-04-21 16:03

You just have to store the pointers returned from new[] somewhere, and delete[] them later. Another vector, for example.

查看更多
不美不萌又怎样
3楼-- · 2019-04-21 16:10

You can use placement new to construct an object at a particular memory location:

vector<A*> a(n);
for(int i = 0; i < n; i++)
  a[i] = new(storage + i*object_size) B();
  // and invoke the destructor manually to release the object (assuming A has a virtual destructor!)
  a[i]->~A(); 

But you cannot solve the 'real' problem without giving up the continuous storage: if one object is freed, it will cause a hole in the heap, thus causing high fragmentation over time. You could only keep track of the freed objects and re-use the storage.

查看更多
时光不老,我们不散
4楼-- · 2019-04-21 16:15

If you know for sure that those will only be objects of type B why not use a parallel vector:

vector<B> storage(n);
vector<A*> pointers(n);
for(int i = 0; i < n; i++)
   pointers[i] = &storage[i];
查看更多
Explosion°爆炸
5楼-- · 2019-04-21 16:16

If you want to keep all your objects in contiguous memory and, at the same time, avoid using an indirection (a vector of base class pointers), you can use a union-style container, e.g. a vector boost::variant. This will, of course, assume that there is a limited and known number of derived classes and that their sizes are comparable. The drawback is that each element of the vector will take as much memory as the biggest derived class, regardless of their actual class (and it also assumes your classes are reasonable cheap to copy). The advantages are that you have contiguous heterogeneous storage of the polymorphic objects, type-safety, and no indirection when accessing elements. Here is a basic example with boost::variant and three classes A, B, C, where both B and C inherit from A, and they are all polymorphic (and, of course, this could be much nicer with some sugar-coating and/or something more specialized for your purpose, not boost::variant which is not really appropriate for this purpose):

#include <iostream>
#include <vector>
#include <boost/variant/variant.hpp>
#include <boost/variant/apply_visitor.hpp>

struct A {
  int v1;
  A(int aV1 = 0) : v1(aV1) { };
  virtual ~A() { };
  virtual void print() { std::cout << "A print " << v1 << std::endl; };
  struct get_ref {
    typedef A& result_type;
    template <class T>
    A& operator()(T& obj) const { return obj; };
  };
};

struct B : A {
  float f1;
  B(float aF1 = 0.0) : A(42), f1(aF1) { };
  ~B() { };
  virtual void print() { std::cout << "B print " << f1 << std::endl; };
};

struct C : A {
  double d1;
  C(double aD1 = 0.0) : A(42), d1(aD1) { };
  ~C() { };
  virtual void print() { std::cout << "C print " << d1 << std::endl; };
};

int main() {
  typedef boost::variant<A,B,C> value_type;
  typedef std::vector< value_type > vect_type;
  vect_type arr(15);
  int i=0;
  for(;i<5;++i) arr[i] = A(31);
  for(;i<10;++i) arr[i] = B(0.2);
  for(;i<15;++i) arr[i] = C(0.4);

  for(vect_type::iterator it = arr.begin(); it != arr.end(); ++it)
    boost::apply_visitor(A::get_ref(), *it).print();

  std::cout << "value_type has size of " << sizeof(value_type) << std::endl;
  std::cout << "compared to: sizeof(A)=" << sizeof(A) << " sizeof(B)=" << sizeof(B) << " sizeof(C)=" << sizeof(C) << std::endl;

  return 0;
};
查看更多
登录 后发表回答