I am looking for a solution to parse ASN.1 spec files and generate a decoder from them.
Ideally I would like to work with Python modules, but if nothing is available I would use C/C++ libraries and interface them with Python using one of the many solutions out there.
In the past I have been using pyasn1 and building everything by hand, but that has become too unwieldy.
I have also looked superficially at libtasn1 and asn1c. The first one had problems parsing even the simplest of files. The second has a good parser, but generating C code for decoding seems too complex; the solution worked well with straightforward specs but choked on complex ones.
Any other good alternatives I may have overlooked?
Never tried them but:
Both seem to do what you want (C, not Python).
There is an ANTLR ASN.1 grammar; using ANTLR, you should be able to make an ASN.1 parser out of it. Generating code for pyasn1 is left as an exercise to the poster :-)
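To give a rough idea of what the parsing step produces, here is a minimal sketch in plain Python. Note that it uses regular expressions rather than ANTLR, and it only handles a deliberately tiny, hypothetical subset of ASN.1: flat `SEQUENCE` definitions with built-in types, an optional `OPTIONAL` marker, and `--` line comments. The names `parse_sequences` and `SPEC` are made up for this illustration; a real parser needs a proper grammar.

```python
import re

# Matches "Name ::= SEQUENCE { ... }" with no nested braces (toy subset only).
TYPE_DEF = re.compile(r"(\w[\w-]*)\s*::=\s*SEQUENCE\s*\{([^{}]*)\}")
# Matches "fieldName TypeName [OPTIONAL]"; the type may contain spaces
# (e.g. "OCTET STRING").
FIELD = re.compile(r"([a-z][\w-]*)\s+([A-Z][A-Za-z0-9 ]*?)\s*(OPTIONAL)?\s*$")


def parse_sequences(spec):
    """Return {type_name: [(field_name, field_type, is_optional), ...]}."""
    # Drop ASN.1 line comments, which start with "--".
    spec = re.sub(r"--.*", "", spec)
    result = {}
    for name, body in TYPE_DEF.findall(spec):
        fields = []
        for item in body.split(","):
            item = item.strip()
            if not item:
                continue
            match = FIELD.match(item)
            if match is None:
                raise ValueError("unsupported field definition: %r" % item)
            fields.append(
                (match.group(1), match.group(2), match.group(3) is not None)
            )
        result[name] = fields
    return result


# Hypothetical example module, used only to exercise the sketch.
SPEC = """
Foo DEFINITIONS ::= BEGIN
    Question ::= SEQUENCE {
        id        INTEGER,
        question  OCTET STRING,
        hint      UTF8String OPTIONAL  -- may be absent
    }
END
"""

parsed = parse_sequences(SPEC)
print(parsed)
```

The resulting dictionary is the kind of intermediate structure from which pyasn1 class definitions could then be generated.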
I have done a similar job using asn1c and building a Pyrex extension around it. The wrapped structure is described in 3GPP TS 32.401.
With Pyrex you can write a wrapper thick enough to convert between native Python data types and the correct ASN.1 representations (wrapper generators, such as SWIG, tend not to perform complex operations on the types). The wrapper I wrote also tracked the ownership of the underlying C data structures (e.g. when accessing a sub-structure, a Python object was returned, but the underlying data was not copied, only shared by reference).
The wrapper was eventually written in a semi-automatic way, but because that has been my only job involving ASN.1, I never took the step of fully automating the code generation.
You can try other Python-C wrappers and perform a completely automatic conversion: it would be less work, but you would then move complexity (and repetitive, error-prone operations) onto the users of the structures. For this reason I preferred the Pyrex way. asn1c was definitely a good choice.
I recently created the Python package called asn1tools which compiles an ASN.1 specification into Python objects, which can be used to encode and decode messages.
I have experience with pyasn1, and it is capable of handling quite complex grammars. A grammar is expressed as Python structures, so there is no need to run a code generator.
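The idea of expressing a grammar as a plain Python structure can be sketched without any library. The following is not pyasn1's API; it is a minimal, standard-library-only illustration in which a hypothetical schema (`QUESTION`, a list of field names and universal tags) drives a small DER decoder. It assumes Python 3, single-byte tags, and a flat `SEQUENCE` of `INTEGER` and `OCTET STRING` fields only.

```python
# Universal tag values from the ASN.1 encoding rules.
INTEGER, OCTET_STRING, SEQUENCE = 0x02, 0x04, 0x30

# Hypothetical schema format: the "grammar" is just a list of
# (field name, expected tag) pairs.
QUESTION = [("id", INTEGER), ("question", OCTET_STRING)]


def read_tlv(data, offset):
    """Read one tag-length-value triple; return (tag, value, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits give the number of length bytes.
        count = length & 0x7F
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    return tag, data[offset:offset + length], offset + length


def decode_sequence(schema, data):
    """Decode a flat SEQUENCE into a dict, following the schema order."""
    tag, body, _ = read_tlv(data, 0)
    if tag != SEQUENCE:
        raise ValueError("expected a SEQUENCE, got tag 0x%02x" % tag)
    result, offset = {}, 0
    for name, expected_tag in schema:
        tag, value, offset = read_tlv(body, offset)
        if tag != expected_tag:
            raise ValueError("field %r: unexpected tag 0x%02x" % (name, tag))
        if tag == INTEGER:
            # INTEGER contents are big-endian two's complement.
            value = int.from_bytes(value, "big", signed=True)
        result[name] = value
    return result


# SEQUENCE { INTEGER 1, OCTET STRING "1+1?" }
encoded = bytes.fromhex("30090201010404312b313f")
decoded = decode_sequence(QUESTION, encoded)
print(decoded)
```

pyasn1 follows the same principle with far more complete type classes, so the schema lives in ordinary Python code rather than in generated output.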
I'm the author of LEPL, a parser written in Python, and what you want to do is one of the things on my "TODO" list.
I will not be doing this soon, but you might consider using LEPL to construct your solution because:
1 - it's a pure Python solution (which makes life simpler)
2 - it can already parse binary data as well as text, so you would only need a single tool: the same parser that you use to parse the ASN.1 spec would then be used to parse the binary data
The main downsides are that:
1 - it's a fairly new package, so it may be buggier than some, and the support community is not that large
2 - it is restricted to Python 2.6 and up (and the binary parser only works with Python 3 and up).
For more information, please see http://www.acooke.org/lepl - in particular, for binary parsing see the relevant section of the manual (I cannot link directly to that as Stack Overflow seems to think I am spamming)
Andrew
PS The main reason I have not already started on this is that the ASN.1 specs are not freely available, as far as I know. If you have access to them, and it is not illegal(!), a copy would be greatly appreciated (unfortunately I am currently working on another project, so this would still take time to implement, but it would help me get this working sooner...).