Segmentation fault in malloc_consolidate (malloc.c

2019-02-06 08:47发布

问题:

This question already has an answer here:

  • Segfaults in malloc() and malloc_consolidate() 2 answers

My program goes in segmentation faults, and I cannot find the cause. The worst part is, the function in question does not always lead to segfault.

GDB confirms the bug and yields this backtrace:

Program received signal SIGSEGV, Segmentation fault.
0xb7da6d6e in malloc_consolidate (av=<value optimized out>) at malloc.c:5169
5169  malloc.c: No such file or directory.
  in malloc.c
(gdb) bt
#0  0xb7da6d6e in malloc_consolidate (av=<value optimized out>) at malloc.c:5169
#1  0xb7da9035 in _int_malloc (av=<value optimized out>, bytes=<value optimized out>) at malloc.c:4373
#2  0xb7dab4ac in __libc_malloc (bytes=525) at malloc.c:3660
#3  0xb7f8dc15 in operator new(unsigned int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xb7f72db5 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) ()
   from /usr/lib/i386-linux-gnu/libstdc++.so.6
#5  0xb7f740bf in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Rep::_M_clone(std::allocator<char> const&, unsigned int) ()
   from /usr/lib/i386-linux-gnu/libstdc++.so.6
#6  0xb7f741f1 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#7  0xb7f6bfec in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow(int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#8  0xb7f70e1c in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#9  0xb7f5b498 in std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, unsigned long) const () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#10 0xb7f5b753 in std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, unsigned long) const () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#11 0xb7f676ac in std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<unsigned long>(unsigned long) ()
   from /usr/lib/i386-linux-gnu/libstdc++.so.6
#12 0xb7f67833 in std::basic_ostream<char, std::char_traits<char> >::operator<<(unsigned int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#13 0x08049c42 in sim::Address::GetS (this=0xbfffec40) at address.cc:27
#14 0x0806a499 in sim::UserGenerator::ProcessEvent (this=0x80a1af0, e=...) at user-generator.cc:59
#15 0x0806694b in sim::Simulator::CommunicateEvent (this=0x809f970, e=...) at simulator.cc:144
#16 0x0806685d in sim::Simulator::ProcessNextEvent (this=0x809f970) at simulator.cc:133
#17 0x08065d76 in sim::Simulator::Run (seed=0) at simulator.cc:53
#18 0x0807ce85 in main (argc=1, argv=0xbffff454) at main.cc:75
(gdb) f 13
#13 0x08049c42 in sim::Address::GetS (this=0xbfffec40) at address.cc:27
27    oss << m_address;
(gdb) p this->m_address
$1 = 1

Method GetS of class Address translates a number (uint32_t m_address) into a string and returns it. The code (very simple) is the following:

std::string
Address::GetS () const
{
  std::ostringstream oss;
  oss << m_address;
  return oss.str ();
}

Besides, as can be seen in the backtrace, m_address is properly defined.

Now, I have tried to run my program using valgrind. The program doesn't crash, likely due to the fact that valgrind replaces malloc () among other functions.

The error summary shows no memory leaking:

LEAK SUMMARY:
   definitely lost: 0 bytes in 0 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 4,367 bytes in 196 blocks
   still reachable: 9,160 bytes in 198 blocks
        suppressed: 0 bytes in 0 blocks

All possibly lost refer to backtraces like this:

80 bytes in 5 blocks are possibly lost in loss record 3 of 26
   at 0x4024B64: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
   by 0x40DBDB4: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/i386-linux-gnu/libstdc++.so.6.0.16)
   by 0x40DE077: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib/i386-linux-gnu/libstdc++.so.6.0.16)
   by 0x40DE1E5: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/i386-linux-gnu/libstdc++.so.6.0.16)
   by 0x806AF62: sim::UserGenerator::CreateUser(unsigned int) (user-generator.cc:152)

I don't think this is related to the bug. However, the code in question can be found following this link.

I am thinking of a bug in libstdc++. However, how likely would that be? I have also upgraded such library. Here's the versions currently installed on my system.

$ dpkg -l | grep libstdc
ii  libstdc++5          1:3.3.6-23  The GNU Standard C++ Library v3
ii  libstdc++6          4.6.1-1     GNU Standard C++ Library v3
ii  libstdc++6-4.1-dev  4.1.2-27    The GNU Standard C++ Library v3 (development files)
ii  libstdc++6-4.3-dev  4.3.5-4     The GNU Standard C++ Library v3 (development files)
ii  libstdc++6-4.4-dev  4.4.6-6     GNU Standard C++ Library v3 (development files)
ii  libstdc++6-4.5-dev  4.5.3-3     The GNU Standard C++ Library v3 (development files)
ii  libstdc++6-4.6-dev  4.6.1-1     GNU Standard C++ Library v3 (development files)

Now the thing is, I am not sure which version g++ uses, and whether there's some means to enforce the use of a particular version.

What I am pondering is to modify GetS. But this is the only method I know. Do you suggest any alternative?

Eventually, I am even considering to replace std::string with simpler char*. Maybe a little drastic, but I wouldn't set it aside.

Any thought in merit?

Thank you all in advance.

Best, Jir

回答1:

Ok. This is NOT the problem:

I am thinking of a bug in libstdc++

The problem is that you overwrote some memory buffer and corrupted one of the structures used by the memory manager. The hard part is going to be finding it. Does not valgrind give you information about writting past the end of an allocated piece of memory.

Don't do this:

Eventually, I am even considering to replace std::string with simpler char*. Maybe a little drastic, but I wouldn't set it aside.

You already have enough problems with memory management. This will just add more problems. There is absolutely NOTHING wrong with std::string or the memory management routines. They are heavily tested and used. If there was something wrong people all over the world would start screaming (it would be big news).

Reading your code at http://mercurial.intuxication.org/hg/lte_sim/file/c2ef6e0b6d41/src/ it seems like you are still stuck in a C style of writting code (C with Classes). So you have the power of C++ to automate (the blowing up of your code) but still have all the problems associated with C.

You need to re-look at your code in terms of ownership. You pass things around by pointer way too much. As a result it is hard to follow the ownership of the pointer (and thus who is responsible for deleting it).

I think you best bet at finding the bug is to write unit tests for each class. Then run the unit tests through val-grind. I know its a pain (but you should have done it to start with now you have the pain all in one go).