Recursive parsing based on level numbers

Posted 2019-07-02 12:59

Question:

I have a tricky question (at least from my perspective) regarding Scala's parser combinators and recursive parsing. I am currently building a small parser which should be able to parse PL/1 structures like this:

  dcl 1 data,
    3 subData,
      5 tmp char(15),
      5 tmp1 char(15),
    3 subData2,
      5 tmp2 char(10),
      5 tmp3 char(5);

In this scenario I want to build an AST as follows:

Record(data)  -> (Record(subData),Record(subData2))

  Record(subData) -> (Char(tmp),Char(tmp1))

  Record(subData2) -> (Char(tmp2),Char(tmp3))  

Meaning that the parent element should be connected to its children. To me this suggests a recursive parser of some kind, but my problem is how to control when to stop descending into sublevels. For instance, when parsing the "3 subData" record structure, the parser should stop when it hits a line whose level number is not greater than its own, in this case the line "3 subData2,".

Can someone help with this issue, or point me in a good direction? My current solution is to solve this problem after I have parsed an unconnected structure.

Thanks in advance.

Regards Stefan

Answer 1:

Basically all you need is Parser.into (it has an alias, >>). It creates a parser combinator whose next step is chosen based on the result of the current parser.
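As a minimal illustration of what `>>` does on its own, here is a sketch that reads a count and then parses exactly that many words. The object name `CountedList` and the parsers `number`, `word` and `counted` are made up for this illustration; only `RegexParsers`, `>>` and `repN` come from the parser combinator library.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical example: the second parser depends on the first one's result.
object CountedList extends RegexParsers {
  def number: Parser[Int] = """\d+""".r ^^ (_.toInt)
  def word: Parser[String] = """[a-z]+""".r

  // `>>` (alias of `into`) hands the parsed count to a function
  // that builds the parser for the rest of the input.
  def counted: Parser[List[String]] = number >> { n => repN(n, word) }
}

// Succeeds with List(foo, bar, baz)
val ok = CountedList.parseAll(CountedList.counted, "3 foo bar baz")

// Fails: the count says 2, so "baz" is left over
val tooMany = CountedList.parseAll(CountedList.counted, "2 foo bar baz")
```

The same mechanism is what lets a level number decide which lines are accepted as children.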

I've prepared a simple example that you can paste into the REPL:

import scala.util.parsing.combinator.RegexParsers

case class Record(data: String, children: List[Record])

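object RecordParser extends RegexParsers {
  // Whitespace is handled explicitly by `ws`, so disable automatic skipping.
  override def skipWhitespace = false

  def ws = "\\s*".r

  // Accepts a level number only if it is greater than `min`; otherwise fails.
  def numberGT(min: Int) = "\\d+".r ^^ { _.toInt } ^? {
    case i if i > min => i
  }

  // Parses one line with a level greater than `n`, then uses `>>` to parse
  // its children: the following lines whose level is greater than this one.
  def subData(n: Int): Parser[Record] = ws ~> numberGT(n) ~ ws ~ ".*".r <~ "\n" >> {
    case sub ~ _ ~ data => rep(subData(sub)) ^^ { Record(data, _) }
  }
}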

val testData = """
1 data
 2 subdata
  3 child1
  3 child2
 2 sub2
"""

RecordParser.parse(RecordParser.subData(0), testData)
res7: RecordParser.ParseResult[Record] = [7.1] parsed: Record(data,List(Record(subdata,List(Record(child1,List()), Record(child2,List()))), Record(sub2,List())))
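
The same idea can be carried over to the PL/1 declaration from the question. What follows is only a sketch under several assumptions: it handles nothing but `char(n)` leaves, treats any item without an attribute as a record, and accepts either `,` or `;` as an item terminator. The names `Node`, `RecordNode`, `CharNode`, `Pl1Sketch`, `levelGT`, `charAttr`, `sep`, `node` and `dcl` are hypothetical and not part of the original example.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait Node
case class RecordNode(name: String, children: List[Node]) extends Node
case class CharNode(name: String, length: Int) extends Node

object Pl1Sketch extends RegexParsers {
  // Default whitespace skipping is kept, so line breaks are ignored.
  def ident: Parser[String] = """[A-Za-z_][A-Za-z0-9_]*""".r
  def int: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Fails (without consuming input) unless the level is deeper than `min`.
  def levelGT(min: Int): Parser[Int] = int ^? { case i if i > min => i }

  def charAttr: Parser[Int] = "char" ~> "(" ~> int <~ ")"
  def sep: Parser[String] = "," | ";"

  // An item with a char attribute is a leaf; otherwise it is a record
  // whose children are the following items with a greater level number.
  def node(parent: Int): Parser[Node] =
    (levelGT(parent) ~ ident ~ opt(charAttr) <~ sep) >> {
      case _ ~ name ~ Some(len) => success(CharNode(name, len))
      case level ~ name ~ None  => rep(node(level)) ^^ (RecordNode(name, _))
    }

  def dcl: Parser[Node] = "dcl" ~> node(0)
}

val pl1Input =
  """dcl 1 data,
    |  3 subData,
    |    5 tmp char(15),
    |    5 tmp1 char(15),
    |  3 subData2,
    |    5 tmp2 char(10),
    |    5 tmp3 char(5);""".stripMargin

val ast = Pl1Sketch.parseAll(Pl1Sketch.dcl, pl1Input)
```

When `rep(node(3))` reaches the line `3 subData2,`, `levelGT(3)` fails because 3 is not greater than 3, so the repetition ends and the enclosing record for `data` picks that line up as its next child.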