Wrapping dynamic array into STL/Boost container?

Posted 2019-02-03 14:42

Question:

I need to wrap a dynamically allocated array (from a = new double[100], for example) in a std::vector (preferably) without copying the array. The restriction exists because the array I want to wrap is mmapped from a file, so simply constructing vector(a, a+size) would double the memory usage.

Are there any tricks to do that?

Answer 1:

One of the best solutions for this is something like STLSoft's array_proxy<> template. Unfortunately, the doc page that doxygen generates from the source code isn't much help in understanding the template. The source code might actually be a bit better:

  • http://www.stlsoft.org/doc-1.9/array__proxy_8hpp-source.html

The array_proxy<> template is described nicely in Matthew Wilson's book, Imperfect C++. The version I've used is a cut-down version of what's on the STLSoft site so I didn't have to pull in the whole library. My version's not as portable, but that makes it much simpler than what's on STLSoft (which jumps through a whole lot of portability hoops).

If you set up a variable like so:

int myArray[100];

array_proxy<int> myArrayProx( myArray);

The variable myArrayProx has many of the STL interfaces - begin(), end(), size(), iterators, etc.

So in many ways, the array_proxy<> object behaves just like a vector (though push_back() isn't there since the array_proxy<> can't grow - it doesn't manage the array's memory, it just wraps it in something a little closer to a vector).

One really nice thing with array_proxy<> is that if you use them as function parameter types, the function can determine the size of the array passed in, which isn't true of native arrays. And the size of the wrapped array isn't part of the template's type, so it's quite flexible to use.
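
To make the idea concrete, here is a minimal sketch of such a non-owning proxy. It is not STLSoft's array_proxy<>: the class name array_view, its members, and the helper sum() are made up for illustration, and the sketch assumes the wrapped memory outlives the view.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical non-owning wrapper: stores a pointer and a length and never
// allocates, copies, or frees the underlying array.
template <class T>
class array_view {
public:
    typedef T value_type;
    typedef T* iterator;
    typedef const T* const_iterator;

    // Wrap an existing block, e.g. one obtained from new[] or mmap.
    array_view(T* data, std::size_t size) : data_(data), size_(size) {}

    // Deduce the length of a native array so callers need not pass it.
    template <std::size_t N>
    array_view(T (&arr)[N]) : data_(arr), size_(N) {}

    iterator begin() { return data_; }
    iterator end() { return data_ + size_; }
    const_iterator begin() const { return data_; }
    const_iterator end() const { return data_ + size_; }

    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }

    T& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
    const T& operator[](std::size_t i) const { assert(i < size_); return data_[i]; }

private:
    T* data_;
    std::size_t size_;
};

// The function learns the element count from the view, which a bare
// double* parameter could not tell it.
inline double sum(const array_view<double>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0);
}
```

For a heap or mmapped block, array_view<double> v(a, 100) wraps the existing memory, and sum(v) then iterates over it in place. The length is a runtime member rather than a template parameter, which is what keeps one function signature usable for arrays of any size.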



Answer 2:

A boost::iterator_range provides a container-like interface:

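#include <boost/range/iterator_range.hpp>
#include <cstddef>
#include <iostream>

// Memory map an array of doubles (mmap_n_doubles stands in for your own mapping code):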
size_t number_of_doubles_to_map = 100;
double* from_mmap = mmap_n_doubles(number_of_doubles_to_map);

// Wrap that in an iterator_range
typedef boost::iterator_range<double*> MappedDoubles;
MappedDoubles mapped(from_mmap, from_mmap + number_of_doubles_to_map);

// Use the range
MappedDoubles::iterator b = mapped.begin();
MappedDoubles::iterator e = mapped.end();
mapped[0] = 1.1;
double first = mapped(0);

if (mapped.empty()){
    std::cout << "empty";
}
else{
    std::cout << "We have " << mapped.size() << " elements. Here they are:\n"
       << mapped;
}


Answer 3:

I was once determined to accomplish the exact same thing. After a few days of thinking and trying, I decided it wasn't worth it. I ended up creating my own custom vector that behaved like std::vector but only had the functionality I actually needed, such as bounds checking, iterators, etc.

If you still want to use std::vector, the only way I could think of back then was to create a custom allocator. I've never written one, but since this is the only way to control the STL's memory management, maybe there is something that can be done there.
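
For what it's worth, here is a rough sketch of what such an allocator could look like. Everything in it (borrowed_buffer_allocator, borrowed_vector, wrap) is a hypothetical name, not an existing library API. It assumes a trivially constructible element type such as double whose values already sit in the buffer, and a standard library that only allocates through the allocator type it was given. Note that the result is a std::vector<double, borrowed_buffer_allocator<double> >, which is a different type from plain std::vector<double>.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical allocator that hands out one caller-supplied buffer instead
// of allocating memory. It never frees that buffer.
template <class T>
struct borrowed_buffer_allocator {
    typedef T value_type;

    borrowed_buffer_allocator(T* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity) {}

    T* allocate(std::size_t n) {
        // Only requests that fit in the borrowed buffer can be satisfied,
        // so any attempt to grow past it fails with std::bad_alloc.
        if (n > capacity_) throw std::bad_alloc();
        return buffer_;
    }

    void deallocate(T*, std::size_t) {}  // the buffer is not ours to free

    // Default construction is a no-op so the existing contents survive.
    template <class U>
    void construct(U*) {}

    // Construction from arguments (e.g. push_back) behaves normally.
    template <class U, class A0, class... Args>
    void construct(U* p, A0&& a0, Args&&... args) {
        ::new (static_cast<void*>(p))
            U(std::forward<A0>(a0), std::forward<Args>(args)...);
    }

    T* buffer_;
    std::size_t capacity_;
};

template <class T>
bool operator==(const borrowed_buffer_allocator<T>& a,
                const borrowed_buffer_allocator<T>& b) {
    return a.buffer_ == b.buffer_;
}

template <class T>
bool operator!=(const borrowed_buffer_allocator<T>& a,
                const borrowed_buffer_allocator<T>& b) {
    return !(a == b);
}

typedef std::vector<double, borrowed_buffer_allocator<double> > borrowed_vector;

// Build a vector of n elements whose storage is the caller's array.
inline borrowed_vector wrap(double* data, std::size_t n) {
    assert(data != 0 || n == 0);
    return borrowed_vector(n, borrowed_buffer_allocator<double>(data, n));
}
```

With this, borrowed_vector v = wrap(a, 100) gives a vector whose data() is a itself. Reads and writes go straight to the original memory, but anything that needs more capacity throws, and code that expects a plain std::vector<double> will not accept it.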



Answer 4:

No, that is not possible using a std::vector.

But, if possible, you can create the vector with the required size and map the file into it instead.

std::vector<double> v(100);
mmapfile_double(&v[0], 100);
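
mmapfile_double above is only a placeholder. As one possible way to fill it in, the sketch below swaps the mapping for a plain binary read into the vector's own buffer, so the data is copied once from the file but never held twice in memory. The names read_doubles and round_trip_demo are hypothetical, and the sketch assumes the file holds raw doubles in the machine's native format.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: fill an already-sized vector with raw doubles read
// from a binary stream. Returns the number of complete doubles read.
inline std::size_t read_doubles(std::istream& in, std::vector<double>& out) {
    if (out.empty()) return 0;
    in.read(reinterpret_cast<char*>(&out[0]),
            static_cast<std::streamsize>(out.size() * sizeof(double)));
    return static_cast<std::size_t>(in.gcount()) / sizeof(double);
}

// Usage: write three doubles to an in-memory stream, then read them back
// into a vector that was sized up front.
inline bool round_trip_demo() {
    const double src[3] = {1.0, 2.0, 3.0};
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    ss.write(reinterpret_cast<const char*>(src), sizeof(src));

    std::vector<double> v(3);
    return read_doubles(ss, v) == 3 && v[0] == 1.0 && v[2] == 3.0;
}
```

For a real file, pass a std::ifstream opened with std::ios::binary in place of the string stream.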


Answer 5:

What about a vector of pointers that point to the elements of your mapped area? This reduces memory consumption only where sizeof(double*) < sizeof(double), as on typical 32-bit platforms; on typical 64-bit platforms both are 8 bytes. Is this OK for you?

There are some drawbacks (the main one is that you need special predicates for sorting) but some benefits too: you can, for example, delete elements without changing the actual mapped content, or even keep several such arrays with different element orders without any change to the actual values.

All solutions that put a std::vector on top of a mapped file share one problem: the vector's content has to stay 'nailed' to the mapped area. Nothing enforces this; you can only take care not to use anything that could make the vector reallocate its content. So be careful in any case.
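
Here is a small sketch of the pointer-vector idea, including the kind of predicate that sorting needs. The names less_by_pointee, make_index and sort_index are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Predicate that orders pointers by the values they point at, not by address.
struct less_by_pointee {
    bool operator()(const double* a, const double* b) const { return *a < *b; }
};

// Build one pointer per element of the mapped block [first, first + n).
inline std::vector<double*> make_index(double* first, std::size_t n) {
    assert(first != 0 || n == 0);
    std::vector<double*> index;
    index.reserve(n);
    for (std::size_t i = 0; i < n; ++i) index.push_back(first + i);
    return index;
}

// Reorder the pointers only; the mapped values themselves are never moved.
inline void sort_index(std::vector<double*>& index) {
    std::sort(index.begin(), index.end(), less_by_pointee());
}
```

After sort_index(idx), *idx[0] is the smallest value while the mapped block keeps its original order, and idx.erase(...) drops an entry without touching the file's content.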



Answer 6:

You could go with array_proxy<>, or take a look at Boost.Array. It gives you size(), front(), back(), at(), operator[], etc. Personally, I'd prefer Boost.Array since Boost is more prevalent anyway.



Answer 7:

Well, the vector template lets you provide your own memory allocator. I have never done it myself, but I guess it is not that difficult to get it to point to your array, maybe with the placement new operator... Just a guess; I will write more if I try and succeed.



Answer 8:

Here's the solution to your question. I had been attempting this off and on for quite some time before I came up with a workable solution. The caveat is that you have got to zero out the pointers after use in order to avoid double-freeing the memory.

#include <cstddef>
#include <vector>
#include <iostream>

// Relies on libstdc++ (GCC) implementation details: _Vector_base, _Vector_impl
// and the _M_* members are internal names, not part of the C++ standard.
// Points the vector's internal pointers at sourceArray without copying.
template <class T>
void wrapArrayInVector( T *sourceArray, size_t arraySize, std::vector<T, std::allocator<T> > &targetVector ) {
  typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *vectorPtr =
    (typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *)((void *) &targetVector);
  vectorPtr->_M_start = sourceArray;
  vectorPtr->_M_finish = vectorPtr->_M_end_of_storage = vectorPtr->_M_start + arraySize;
}

// Detaches the vector from the array so that its destructor does not free
// memory it does not own.
template <class T>
void releaseVectorWrapper( std::vector<T, std::allocator<T> > &targetVector ) {
  typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *vectorPtr =
        (typename std::_Vector_base<T, std::allocator<T> >::_Vector_impl *)((void *) &targetVector);
  vectorPtr->_M_start = vectorPtr->_M_finish = vectorPtr->_M_end_of_storage = NULL;
}

int main() {

  int tests[6] = { 1, 2, 3, 6, 5, 4 };
  std::vector<int> targetVector;
  wrapArrayInVector( tests, 6, targetVector);

  std::cout << std::hex << &tests[0] << ": " << std::dec
            << tests[1] << " " << tests[3] << " " << tests[5] << std::endl;

  std::cout << std::hex << &targetVector[0] << ": " << std::dec
            << targetVector[1] << " " << targetVector[3] << " " << targetVector[5] << std::endl;

  releaseVectorWrapper( targetVector );
}

Alternatively you could just make a class that inherits from vector and nulls out the pointers upon destruction:

// Also depends on libstdc++ internals (_M_impl and its _M_* members).
template <class T>
class vectorWrapper : public std::vector<T>
{
public:
  vectorWrapper() {
    this->_M_impl._M_start = this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = NULL;
  }

  vectorWrapper(T* sourceArray, int arraySize)
  {
    this->_M_impl._M_start = sourceArray;
    this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = sourceArray + arraySize;
  }

  // Null the pointers so the base std::vector destructor does not free the wrapped array.
  ~vectorWrapper() {
    this->_M_impl._M_start = this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = NULL;
  }

  void wrapArray(T* sourceArray, int arraySize)
  {
    this->_M_impl._M_start = sourceArray;
    this->_M_impl._M_finish = this->_M_impl._M_end_of_storage = sourceArray + arraySize;
  }
};