Is there some way to use boost::obect_pool with fa

2019-06-14 20:05发布

问题:

I have been using boost object_pool for some time and was generally happy with the results. Previously I was mostly allocating individual objects but rarely freed them individually, just freed entire pool at once. Recently when I came across the need to free many objects from the pool I discovered that it is VERY slow.

Apparently pool is searching through the list of already released chunks to link the newly released object. The documentation talks about ordered and unordered pools and mentions pool_allocator as well as fast_pool_allocator. Unordered pool (using fast_memory_allocator) presumably would be much faster in releasing memory chunks. However, I can't see how I can make use of it.

Do I understand correctly that I have a choice between pool_allocator and fast_pool_allocator only in conjunction with boost::singleton_pool but not with boost::object_pool?

Below is a small test program illustrating the problem. It was built with VS2013 and boost 1_57_0. Test allocates n objects in object pool and then randomly releases 10%. It has some crude timing instrumentation which shows that for n == 100,000 allocation takes 0.004 sec while release takes 0.4 sec. Meanwhile for n == 1,000,000 it takes 0.022 sec to allocate and 42 sec to release on my machine.

#include <boost/pool/object_pool.hpp>
#include "time.h"
#include <vector>
#include <random>

struct foo {
    int data[10];
};

struct test {
    test(unsigned n) : size{ n } {}
    void run();
    float elapsedSec(clock_t& watch);
    unsigned size;
    boost::object_pool<foo> _pool;
    float mallocSec;
    float freeSec;
};

void test::run() {
    std::vector<foo *> foos(size, nullptr);
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, size - 1);
    auto dice = std::bind(distribution, generator);
    clock_t watch = clock();
    for (int i = 0; i < size; ++i)
        foos[i] = _pool.malloc();
    mallocSec = elapsedSec(watch);
    for (int i = 0; i < size / 10;) {
        auto idx = dice();
        if (foos[idx] == nullptr)
            continue;
        _pool.free(foos[idx]);
        foos[idx] = nullptr;
        i += 1;
    }
    freeSec = elapsedSec(watch);
}

float test::elapsedSec(clock_t& watch) {    
    clock_t start = watch;
    watch = clock();
    return (watch - start) / static_cast<float>(CLOCKS_PER_SEC);   
}

回答1:

The free call hits ordered_free on the underlying allocator. Indeed, ordered_malloc_need_resize() takes basically all the execution time.

Do I understand correctly that I have a choice between pool_allocator and fast_pool_allocator only in conjunction with boost::singleton_pool but not with boost::object_pool?

Yes. The object_pool manifestly uses the ordered_free function on the underlying simple_segregated_storage. This is clearly by design (although the rationale escapes me for the moment. Apparently in the intended usage it makes sense for object_pool to always optimize for array allocations/de-allocations).

The way I understand things now pool_allocator and fast_pool_allocations are classes that stand on their own and are not related to arguments/options of singleton_pool.

Yes. They are hardwired to use the singleton pools instances. Boost Pool clearly predates standard library support for stateful allocators. You could copy the implementation of fast_pool_allocator out to use a runtime instance of pool instead of the singleton pool.

The following sample makes non_boost::fast_pool_allocator a stateful allocator on top of a specific "Object Use" pool instance. That makes tha allocator stateful. The state is mainly a pointer to the pool.

_alloc.destroy is used to destruct the foo instance and free the memory. Any unfreed elements will still be freed upon destruction of the _pool (Note since we don't use object_pool, no destructors for foo will be run in such a case. In your sample, foo is POS and therefore trivially destructible. If not, you can of course use a std::unique_ptr or similar, or indeed write a version of object_pool that doesn't insist on ordered allocation).

DEMO

Live On Coliru

#include <boost/pool/pool.hpp>
#include <boost/pool/object_pool.hpp>
#include <boost/pool/pool_alloc.hpp>
#include "time.h"
#include <vector>
#include <random>

struct foo {
    int data[10];
};

namespace non_boost {
    template <typename T, typename UserAllocator = boost::default_user_allocator_new_delete>
    class fast_pool_allocator
    {
      public:
        typedef T value_type;
        typedef UserAllocator user_allocator;

        typedef value_type * pointer;
        typedef const value_type * const_pointer;
        typedef value_type & reference;
        typedef const value_type & const_reference;
        typedef boost::pool<UserAllocator> pool_type;
        typedef typename pool_type::size_type       size_type;
        typedef typename pool_type::difference_type difference_type;

        template <typename U>
        struct rebind {
            typedef fast_pool_allocator<U, UserAllocator> other;
        };

        pool_type* _ref;
      public:
        fast_pool_allocator(pool_type& ref) : _ref(&ref) { }

        fast_pool_allocator(fast_pool_allocator const&) = default;
        fast_pool_allocator& operator=(fast_pool_allocator const&) = default;

        // Not explicit, mimicking std::allocator [20.4.1]
        template <typename U>
        fast_pool_allocator(const fast_pool_allocator<U, UserAllocator> & other) : _ref(other._ref)
        { }

        // Default destructor used.
        static pointer address(reference r)                     { return &r;                                      } 
        static const_pointer address(const_reference s)         { return &s;                                      } 
        static size_type max_size()                             { return (std::numeric_limits<size_type>::max)(); } 
        void construct(const pointer ptr, const value_type & t) { new (ptr) T(t);                                 } 
        void destroy(const pointer ptr)                         { ptr->~T();                                      } 

        bool operator==(fast_pool_allocator const& rhs) const { return _ref == rhs._ref; }
        bool operator!=(fast_pool_allocator const& rhs) const { return _ref != rhs._ref; }

        pointer allocate(const size_type n)
        {
            const pointer ret = (n == 1) 
                ? static_cast<pointer>( (_ref->malloc)() ) 
                : static_cast<pointer>( _ref->ordered_malloc(n) );
            if (ret == 0)
                boost::throw_exception(std::bad_alloc());
            return ret;
        }

        pointer allocate(const size_type n, const void * const) { return allocate(n); }
        pointer allocate()
        {
            const pointer ret = static_cast<pointer>( (_ref->malloc)() );
            if (ret == 0)
                boost::throw_exception(std::bad_alloc());
            return ret;
        }
        void deallocate(const pointer ptr, const size_type n)
        {

#ifdef BOOST_NO_PROPER_STL_DEALLOCATE
            if (ptr == 0 || n == 0)
                return;
#endif
            if (n == 1)
                (_ref->free)(ptr);
            else
                (_ref->free)(ptr, n);
        }
        void deallocate(const pointer ptr) { (_ref->free)(ptr); }
    };

    //Specialization of fast_pool_allocator<void> required to make the allocator standard-conforming.
    template<typename UserAllocator>
    class fast_pool_allocator<void, UserAllocator> {
    public:
        typedef void*       pointer;
        typedef const void* const_pointer;
        typedef void        value_type;

        template <class U> struct rebind {
            typedef fast_pool_allocator<U, UserAllocator> other;
        };
    };

}

struct test {
    test(unsigned n) : size{ n } {}
    void run();
    float elapsedSec(clock_t& watch);
    unsigned size;

    boost::pool<boost::default_user_allocator_malloc_free> _pool { sizeof(foo) };
    non_boost::fast_pool_allocator<foo, boost::default_user_allocator_malloc_free> _alloc { _pool };

    float mallocSec;
    float freeSec;
};

void test::run() {
    std::vector<foo *> foos(size, nullptr);
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, size - 1);

    auto dice = std::bind(distribution, generator);
    clock_t watch = clock();

    for (unsigned i = 0; i < size; ++i)
         foos[i] = _alloc.allocate();

    mallocSec = elapsedSec(watch);

    for (unsigned i = 0; i < size / 10;) {
        auto idx = dice();
        if (foos[idx] != nullptr)
        {
            _alloc.destroy(foos[idx]);
            foos[idx] = nullptr;
        }
        i += 1;
    }

    freeSec = elapsedSec(watch);
}

float test::elapsedSec(clock_t& watch) {    
    clock_t start = watch;
    watch = clock();
    return (watch - start) / static_cast<float>(CLOCKS_PER_SEC);   
}

int main() {
    test t(10u << 20);
    t.run();

    std::cout << t.mallocSec << "\n";
    std::cout << t.freeSec   << "\n";
}

Prints (on my system):

0.135127
0.050991