Invalid write size of 4 with matrix class (using v

2019-08-02 20:00发布

问题:

I have a simple matrix class I've made for use with opengl (es 2.0, afaik there isn't any built in version for that particular version of opengl).

It is basically nothing more than a vector which is resized to contain 16 floats in the constructor, (I chose a vector over just a 16 float array for the = operator it had), and some convenience functions that are useful in the context of opengl. Anyway, one of these convenience structures loads the identity matrix into the vector, it is:

void Matrix::loadIdentity()
{
    matrix.resize(16, 0);
    for(int i = 0; i < 16; i++)
        matrix[i] = 0.0f;
    for(int i = 0; i < 4; i++) {
        matrix[i * 4 + i] = 1.0f;
    }
}

Where matrix is:

std::vector<float>

The line that says:

matrix.resize(16, 0);

is there to make sure that the size is actually 16 floats, although it's also done in the constructor, and I don't believe it will be needed ultimately, but right now to remove one possible bug, to make life easier for debugging.

Valgrind is telling me that this function, in particular this line:

matrix[i] = 0.0f;

is causing an invalid write size of 4, which sounds suspiciously like I'm somehow calling matrix[16] = 0.0f;. However, i should only go from 0 to 15. One interesting thing to note though is that valgrind also says that the memory in question that's getting written to, was allocated by the destructor of the matrix, and more specifically, the destructor of the vector.

Address 0x150f8280 is 0 bytes inside a block of size 64 free'd  1: operator delete(void*) in /build/buildd/valgrind-3.6.1/coregrind/m_replacemalloc/vg_replace_malloc.c:387
  2: __gnu_cxx::new_allocator&lt;float&gt;::deallocate(float*, unsigned long) in <a href="file:///usr/include/c++/4.5/ext/new_allocator.h:95" >/usr/include/c++/4.5/ext/new_allocator.h:95</a>
  3: std::_Vector_base&lt;float, std::allocator&lt;float&gt; &gt;::_M_deallocate(float*, unsigned long) in <a href="file:///usr/include/c++/4.5/bits/stl_vector.h:146" >/usr/include/c++/4.5/bits/stl_vector.h:146</a>
  4: std::_Vector_base&lt;float, std::allocator&lt;float&gt; &gt;::~_Vector_base() in <a href="file:///usr/include/c++/4.5/bits/stl_vector.h:132" >/usr/include/c++/4.5/bits/stl_vector.h:132</a>
  5: std::vector&lt;float, std::allocator&lt;float&gt; &gt;::~vector() in <a href="file:///usr/include/c++/4.5/bits/stl_vector.h:314" >/usr/include/c++/4.5/bits/stl_vector.h:314</a>
  6: Matrix::~Matrix() in <a href="file:///home/leif/MarbleMachine/core/matrix.h:6" >core/matrix.h:6</a>

This seems unlikely if it's just some sort of index out of range error, seeing as, afaik, a vector object is primarily a pointer to where the array is in memory (well, there's other stuff too, but the point I'm trying to make is that the vector object itself (where the destructor would be stored), seems unlikely that it would be contiguous in memory with where the actual array is stored. Also, on top of that, why would a destructor claim memory when it's whole point is to free up the memory an object uses, wouldn't that just create a memory leak? By the way, the matrix class doesn't have an explicit destructor, so just the implicit one is used.

Finally, I know that previous errors in valgrind can cause future errors, but the only previous errors is invalid read size and use of uninitiated variables inside of existing libraries (OpenGL and SDL), as such, I find it extremely unlikely that they are causing a heap corruption before this happens.

Thank you.

回答1:

It's pretty clear from the snippet of Valgrind report you posted that you are accessing dangling storage for a vector.

Such access could easily arise if one thread is iterating over a vector, while another thread has resized it. For example, this is unsafe:

for (vector<float>::iterator it = matrix.begin(); it != matrix.end(); ++it)
  *it = 0;

while in another thread:

matrix.resize(32, 0);

If you have mutators that can resize the vector, then you must serialize all readers with respect to them, e.g.

pthread_mutex_lock(&matrix_lock);
for (vector<float>::iterator it = matrix.begin(); it != matrix.end(); ++it)
  *it = 0;
pthread_mutex_unlock(&matrix_lock);

In another thread:

pthread_mutex_lock(&matrix_lock);
matrix.resize(32, 0);
pthread_mutex_unlock(&matrix_lock);

What you've described in your answer:

1 Graphics thread gets matrix.end() outside of lock [which is already a problem].
...
6 Graphics thread locks
7 Graphics thread gets matrix.begin() and starts iterating

appears to not be consistent with Valgrind error report, because it would have resulted in access past the end of the array, whereas Valgrind reports access at the beginning of the (now deleted/dangling array).

Bottom line: calling begin(), end(), and iteration must all happen within a lock that protects the container; otherwise your program behavior will be undefined.



回答2:

Thank you for the help, I think I finally figured it out. It actually turns out I didn't give you enough information to solve the problem, so before I give you the solution I found, let me give you some more information.

The app was a multi-threaded app, in each thread, it has a portion where it iterates over a loop of objects (the same objects), which contain the matrix in it. In one case a physics engine, and the other, rendering the objects. I did lock the data in both threads when appropriate, but it turns out that I got the end of the iterator before the lock.

So what I believe was happening was this:

  1. Graphics thread determined the end iterator of the vertex array.
  2. Physics thread locked the data.
  3. Physics thread came through and due to a collision, added new items to the thread.
  4. New array size would be greater then it's capacity, so it's capacity is doubled, thus moving all of the data, and invalidating all iterators. (I know they're always invalidated whenever data is added, but it has no ill side effects unless the location of the data changes, afaik).
  5. Physics thread unlocked data.
  6. Graphics thread locked data.
  7. Graphics thread found starting iterator with the proper location, but because the for loop compared it to the iterator found in item #1, it goes off the end of the array, flipping invalid data (related to setting up the item's matrix for drawing).

THank you everyone for your help.