Scala Parser that sometimes skips whitespace and s

Posted 2019-06-20 08:17

I've got a working Scala parser but the solution is not as clean as I would like. The problem is that some of the productions must consider whitespace as part of the token but the "higher-level" productions should be able to ignore/skip the whitespace.

If I use the typical Scala parser pattern of extending the lower-level parsers, then the skipWhitespace settings are inherited and things get messy very quickly.

I think I would be better off not using the extends approach but rather have an instance of the low level parser available in the higher level parsers' class -- but I'm not sure how to make that work, such that each instance would see only one stream of input characters.

Here is part of the lowest-level parser -

class VulgarFractionParser extends RegexParsers  {
  override type Elem = Char

  override val whiteSpace = "".r

Then I extend that like

class NumberParser extends VulgarFractionParser with Positional {

But at this point the NumberParser must explicitly handle whitespace just like the VulgarFractionParser. For the NumberParser it is still pretty manageable, but at the next level up I really want to be able to define productions that use whitespace as a separator, just like a normal RegexParsers would.

An example would be something like:

IBM 33.33/ 1200.00
or
IBM 33.33/33.50 1200.00

The 2nd value sometimes has two parts separated by a "/" and sometimes only has a single part with nothing after the slash (or even not containing a slash at all).

  def bidOrAskPrice = ("$"?) ~> (bidOrAskPrice1 | bidOrAskPrice2 | bidOrAskPrice3)

  def bidOrAskPrice1 = number ~ ("/".r) ~ number ~ (SPACES) ^^ {
    case a ~ slash ~ b ~ sp1 => BidOrAsk(a, Some(b))
  }
  def bidOrAskPrice2 = (number ~ "/" ~ (SPACES)) ^^ { case a ~ slash ~ sp => BidOrAsk(a, None) }
  def bidOrAskPrice3 = (number ~ (SPACES?)) ^^ { case a ~ sp => BidOrAsk(a, None) }

Tags: parsing scala
2 answers
孤傲高冷的网名
#2 · 2019-06-20 08:32

One solution is to override the handleWhiteSpace function and switch whitespace skipping on and off with a var in your extended class.

You can see the code of RegexParsers here: https://github.com/scala/scala/blob/v2.9.2/src/library/scala/util/parsing/combinator/RegexParsers.scala
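
A minimal sketch of that idea, assuming the scala-parser-combinators module is on the classpath. The names QuoteParser, skipping, strict, Quote and the field names are hypothetical and chosen for illustration; only handleWhiteSpace, opt and parseAll come from the library. The strict combinator skips leading whitespace once, then turns skipping off while the wrapped parser runs, so "33.33/33.50" must be contiguous while the surrounding productions still treat whitespace as a separator.

```scala
import scala.util.parsing.combinator.RegexParsers

case class BidOrAsk(bid: BigDecimal, ask: Option[BigDecimal])
case class Quote(symbol: String, price: BidOrAsk, size: BigDecimal)

object QuoteParser extends RegexParsers {
  // Hypothetical flag: when false, no whitespace is skipped before tokens.
  // Mutable state, so this parser object is not thread-safe.
  private var skipping = true

  override protected def handleWhiteSpace(source: CharSequence, offset: Int): Int =
    if (skipping) super.handleWhiteSpace(source, offset) else offset

  // Run p with whitespace skipping disabled, after skipping leading whitespace once.
  def strict[T](p: => Parser[T]): Parser[T] = Parser { in =>
    val start = handleWhiteSpace(in.source, in.offset)
    val saved = skipping
    skipping = false
    try p(in.drop(start - in.offset)) finally skipping = saved
  }

  def number: Parser[BigDecimal] = """\d+(\.\d+)?""".r ^^ { s => BigDecimal(s) }
  def symbol: Parser[String] = "[A-Z]+".r

  // Whitespace-sensitive production: optional "$", number, optional "/" and second number.
  def bidOrAsk: Parser[BidOrAsk] =
    strict(opt("$") ~> number ~ opt("/" ~> opt(number))) ^^ {
      case a ~ b => BidOrAsk(a, b.flatten)
    }

  // Higher-level production: whitespace between the parts is skipped as usual.
  def quote: Parser[Quote] = symbol ~ bidOrAsk ~ number ^^ {
    case s ~ p ~ n => Quote(s, p, n)
  }
}
```

With this sketch, QuoteParser.parseAll(QuoteParser.quote, "IBM 33.33/ 1200.00") should yield a price with no second part, while an input such as "IBM 33.33 /33.50 1200.00" is rejected because the slash is not adjacent to the first number.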

#3 · 2019-06-20 08:43

Doesn't it make more sense to turn the first parser into a token parser (a lexer, really), and make the second parser read that instead of plain Char?
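
A rough sketch of that two-stage layout, again assuming the scala-parser-combinators module. All type and object names here (Token, Sym, Price, Num, TokenReader, QuoteLexer, TokenQuoteParser, ParsedQuote) are hypothetical. The lexer is a RegexParsers with whitespace skipping turned off, so it decides exactly where whitespace matters. The second parser is a plain Parsers whose Elem is Token, so it never sees whitespace at all.

```scala
import scala.util.parsing.combinator.{Parsers, RegexParsers}
import scala.util.parsing.input.{NoPosition, Position, Reader}

sealed trait Token
case class Sym(name: String) extends Token
case class Price(bid: BigDecimal, ask: Option[BigDecimal]) extends Token
case class Num(value: BigDecimal) extends Token

case class ParsedQuote(symbol: String, bid: BigDecimal, ask: Option[BigDecimal], size: BigDecimal)

// Stage 1: characters to tokens; whitespace is significant here.
object QuoteLexer extends RegexParsers {
  override def skipWhitespace: Boolean = false

  def num: Parser[BigDecimal] = """\d+(\.\d+)?""".r ^^ { s => BigDecimal(s) }

  // A number followed directly by "/" is a Price; otherwise it is a plain Num.
  def numeric: Parser[Token] = opt("$") ~> num ~ opt("/" ~> opt(num)) ^^ {
    case a ~ None    => Num(a)
    case a ~ Some(b) => Price(a, b)
  }

  def token: Parser[Token] = "[A-Z]+".r ^^ { s => Sym(s) } | numeric

  def lex(line: String): Option[List[Token]] =
    parseAll(repsep(token, """\s+""".r), line.trim) match {
      case Success(ts, _) => Some(ts)
      case _              => None
    }
}

// Minimal Reader over a token list; positions are not tracked in this sketch.
class TokenReader(tokens: List[Token]) extends Reader[Token] {
  def first: Token = tokens.head
  def rest: Reader[Token] = new TokenReader(tokens.drop(1))
  def pos: Position = NoPosition
  def atEnd: Boolean = tokens.isEmpty
}

// Stage 2: tokens to a result; no whitespace handling needed.
object TokenQuoteParser extends Parsers {
  type Elem = Token

  def symbol: Parser[String] = accept("symbol", { case Sym(n) => n })
  def price: Parser[(BigDecimal, Option[BigDecimal])] =
    accept("price", { case Price(a, b) => (a, b); case Num(a) => (a, None) })
  def size: Parser[BigDecimal] = accept("number", { case Num(v) => v })

  def quote: Parser[ParsedQuote] = symbol ~ price ~ size ^^ {
    case s ~ ((bid, ask)) ~ n => ParsedQuote(s, bid, ask, n)
  }

  def parseLine(line: String): Option[ParsedQuote] =
    QuoteLexer.lex(line).flatMap { ts =>
      phrase(quote)(new TokenReader(ts)) match {
        case Success(q, _) => Some(q)
        case _             => None
      }
    }
}
```

The trade-off is an extra pass and a token type to maintain, but each stage then has a single whitespace policy and nothing is inherited between them.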
