Parsec or happy (with alex) or uu-parsinglib

2019-01-31 09:38发布

问题:

I am going to write a parser of verilog (or vhdl) language and will do a lot of manipulations (sort of transformations) of the parsed data. I intend to parse really big files (full Verilog designs, as big as 10K lines) and I will ultimately support most of the Verilog. I don't mind typing but I don't want to rewrite any part of the code whenever I add support for some other rule.

In Haskell, which library would you recommend? I know Haskell and have used Happy before (to play). I feel that there are possibilities in using Parsec for transforming the parsed string in the code (which is a great plus). I have no experience with uu-paringlib.

So to parse a full-grammar of verilog/VHDL which one of them is recommended? My main concern is the ease and 'correctness' with which I can manipulate the parsed data at my whim. Speed is not a primary concern.

回答1:

I personally prefer Parsec with the help of Alex for lexing.

I prefer Parsec over Happy because 1) Parsec is a library, while Happy is a program and you'll write in a different language if you use Happy and then compile with Happy. 2) Parsec gives you context-sensitive parsing abilities thanks to its monadic interface. You can use extra state for context-sensitive parsing, and then inspect and decide depending on that state. Or just look at some parsed value before and decide on next parsers etc. (like a <- parseSomething; if test a then ... do ...) And when you don't need any context-sensitive information, you can simply use applicative style and get an implementation like implemented in YACC or a similar tool.

As a downside of Parsec, you'll never know if your Parsec parser contains a left recursion, and your parser will get stuck in runtime (because Parsec is basically a top-down recursive-descent parser). You have to find left recursions and eliminate them. YACC-style parsers can give you some static guarantees and information (like shift/reduce conflicts, unused terminals etc.) that you can't get with Parsec.

Alex is highly recommended for lexing in both situations (I think you have to use Alex if you decide to go on with Happy). Because even if you use Parsec, it really simplifies your parser implementation, and catches a great deal of bugs too (for example: parsing a keyword as an identifier was a common bug I did while I was using Parsec without Alex. It's just one example).

You can have a look at my Lua parser implemented in Alex+Parsec And here's the code to use Alex-generated tokens in Parsec.

EDIT: Thanks John L for corrections. Apparently you can do context-sensitive parsing with Happy too. Also, Alex for lexing is not required in Happy, though it's recommended.