I'm working on a utility which needs to resolve hex addresses to a symbolic function name and source code line number within a binary. The utility will run on Linux on x86, though the binaries it analyzes will be for a MIPS-based embedded system. The MIPS binaries are in ELF format, using DWARF for the symbolic debugging information.
I'm currently planning to fork objdump, passing in a list of hex addresses and parsing the output to get function names and source line numbers. I have compiled an objdump with support for MIPS binaries, and it is working.
I'd prefer a package that lets me look things up natively from Python code without forking another process. I can find no mention of libdwarf, libelf, or libbfd on python.org, nor any mention of Python on dwarfstd.org.
Is there a suitable module available somewhere?
I don't know of any, but if all else fails you could use ctypes to directly use libdwarf, libelf or libbfd.
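As a rough idea of what the ctypes route looks like, here is a minimal sketch that tries to load libelf and call `elf_version`, which libelf expects before any other call. It assumes a libelf shared library is installed where `ctypes.util.find_library` can find it. The helper names `load_libelf` and `libelf_version` are made up for illustration, and `EV_CURRENT = 1` is copied from `<libelf.h>`.

```python
import ctypes
import ctypes.util

EV_CURRENT = 1  # value of EV_CURRENT in <libelf.h>


def load_libelf():
    """Return a ctypes handle to libelf, or None if it is not available."""
    path = ctypes.util.find_library("elf")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # unsigned int elf_version(unsigned int version);
        lib.elf_version.argtypes = [ctypes.c_uint]
        lib.elf_version.restype = ctypes.c_uint
    except (OSError, AttributeError):
        return None
    return lib


def libelf_version():
    """Initialise libelf; return elf_version's result, or None without libelf."""
    lib = load_libelf()
    if lib is None:
        return None
    return lib.elf_version(EV_CURRENT)
```

Everything past this point (opening the file, walking sections, handing the descriptor to libdwarf) means declaring each C prototype and struct by hand in the same way, which is the main cost of this approach.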
You should give Construct a try. It is very useful for parsing binary data into Python objects.
There is even an example for the ELF32 file format.
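To give a feel for what such a parse recovers, here is a sketch of reading the ELF32 file header. Note that it uses the standard library's `struct` module rather than Construct itself, so it is a stand-in and not the Construct example. The field names follow the ELF specification; `parse_elf32_header` is a hypothetical helper name.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_MIPS = 8  # e_machine value for MIPS

# Fields that follow the 16-byte e_ident array in an ELF32 header.
_FIELDS = (
    "e_type", "e_machine", "e_version", "e_entry", "e_phoff", "e_shoff",
    "e_flags", "e_ehsize", "e_phentsize", "e_phnum", "e_shentsize",
    "e_shnum", "e_shstrndx",
)


def parse_elf32_header(data):
    """Parse the 52-byte ELF32 header from `data` into a dict."""
    if len(data) < 52 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class != 1:
        raise ValueError("not a 32-bit ELF file")
    if ei_data not in (1, 2):
        raise ValueError("unknown data encoding")
    # EI_DATA: 1 = little-endian, 2 = big-endian.
    endian = "<" if ei_data == 1 else ">"
    values = struct.unpack_from(endian + "HHIIIIIHHHHHH", data, 16)
    return dict(zip(_FIELDS, values))
```

For a MIPS binary, `parse_elf32_header(open(path, "rb").read(52))["e_machine"]` should equal `EM_MIPS`. Getting from this header to DWARF line information still means locating and decoding the `.debug_*` sections, which is a much larger job.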
I've been developing a DWARF parser using Construct. Currently fairly rough, and parsing is slow. But I thought I should at least let you know. It may suit your needs, with a bit of work.
I've got the code in Mercurial, hosted at bitbucket:
Construct is a very interesting library. DWARF is a complex format (as I'm discovering) and pushes Construct to its limits, I think.
You might be interested in the DWARF library from pydevtools:
hachoir is another library for parsing binary data.
Please check pyelftools - a new pure Python library meant to do this.
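A minimal sketch of an address-to-line lookup with pyelftools, loosely modelled on the address-decoding example that ships with the library. It assumes pyelftools is installed and that the binary uses a DWARF version below 5, where file indices in the line table are 1-based. The function names `find_line_state` and `resolve_address` are my own, not part of pyelftools.

```python
def find_line_state(entries, address):
    """Return the line-program state whose range covers `address`, or None.

    `entries` is an iterable of line-program entries, each with a `.state`
    attribute that is either None or has `address`, `line`, `file` and
    `end_sequence` fields.
    """
    prev = None
    for entry in entries:
        state = entry.state
        if state is None:
            continue  # this entry did not emit a row in the line table
        if prev is not None and prev.address <= address < state.address:
            return prev
        # An end_sequence row closes a range and must not start a new one.
        prev = None if state.end_sequence else state
    return None


def resolve_address(elf_path, address):
    """Return (file_name, line) for `address`, or None if it is not found."""
    # Imported here so find_line_state stays usable without pyelftools.
    from elftools.elf.elffile import ELFFile

    with open(elf_path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return None
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            lineprog = dwarf.line_program_for_CU(cu)
            if lineprog is None:
                continue
            state = find_line_state(lineprog.get_entries(), address)
            if state is not None:
                # Assumption: DWARF < 5, so file indices start at 1.
                name = lineprog["file_entry"][state.file - 1].name
                return name.decode("utf-8", "replace"), state.line
    return None
```

Usage would be something like `resolve_address("firmware.elf", 0x80001234)`, where both the path and the address are placeholders. Function names can be recovered in a similar way by walking each CU's DIEs for `DW_TAG_subprogram` entries and comparing their `DW_AT_low_pc`/`DW_AT_high_pc` range against the address.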