How to create a parser from Regex in Scala to pars

Posted 2020-04-17 07:33

I am writing a parser that parses a path and performs arithmetic calculations. Since I cannot use RegexParsers together with StandardTokenParsers, I am trying to write my own regex-based parser. The code below is what I am using; part of it comes from another discussion:

lexical.delimiters ++= List("+","-","*","/", "^","(",")",",")
import lexical.StringLit
def regexStringLit(r: Regex): Parser[String] =
acceptMatch( "string literal matching regex " + r,{ case  StringLit(s) if r.unapplySeq(s).isDefined => s })

def pathIdent: Parser[String] =regexStringLit ("/hdfs://([\\d.]+):(\\d+)/([\\w/]+/(\\w+\\.w+))".r)

def value :Parser[Expr] = numericLit ^^ { s => Number(s) }
def variable:Parser[Expr] =  pathIdent ^^ { s => Variable(s) }
def parens:Parser[Expr] = "(" ~> expr <~ ")"

def argument:Parser[Expr] = expr <~ (","?)
def func:Parser[Expr] = ( pathIdent ~ "(" ~ (argument+) ~ ")" ^^ { case f ~ _ ~ e ~ _ => Function(f, e) })
//some other code
 def parse(s:String) = {
    val tokens = new lexical.Scanner(s)
    phrase(expr)(tokens)
}

I then pass my input to the program through args(0): "/hdfs://111.33.55.2:8888/folder1/p.a3d+1"

This is the error I get:

[1.1] failure: string literal matching regex /hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+)) expected

  /hdfs://111.33.55.2:8888/folder1/p.a3d
  ^

I tried a simpler path, and I also commented out the rest of the code and left only the path part, but regexStringLit still does not seem to work for me. I suspect I have made a syntax mistake, but I cannot find it.

2 Answers
ら.Afraid
#2 · 2020-04-17 08:13

I solved it by writing a trait and using JavaTokenParsers rather than StandardTokenParsers (the trait below extends RegexParsers, which JavaTokenParsers itself extends).

trait pathIdentifier extends RegexParsers {
  def pathIdent: Parser[String] =
    """hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))""".r
}

@Tilo Thanks for your help. Your solution works as well, but changing the extended class to JavaTokenParsers is what solved the problem.
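For readers who want to try this approach, here is a minimal sketch of how such a trait might be exercised. It assumes the scala-parser-combinators module is on the classpath; the object name PathParser and the helper parsePath are hypothetical and not part of the answer, and only pathIdent and its regex are taken from it.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical wrapper object; only pathIdent and its regex come from the answer.
object PathParser extends RegexParsers {
  // RegexParsers provides an implicit conversion from Regex to Parser[String].
  def pathIdent: Parser[String] =
    """hdfs://([\d\.]+):(\d+)/([\w/]+/(\w+\.\w+))""".r

  // Hypothetical helper: returns the path only if the whole input is a path.
  def parsePath(input: String): Option[String] =
    parseAll(pathIdent, input) match {
      case Success(path, _) => Some(path)
      case _                => None
    }
}
```

Note that this regex has no leading slash, so it accepts hdfs://111.33.55.2:8888/folder1/p.a3d but not the /hdfs://... form used in the question. Because parseAll requires the whole input to be consumed, a trailing +1 is rejected until an expression grammar is layered on top of pathIdent.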

SAY GOODBYE
#3 · 2020-04-17 08:26

There are a couple of mistakes in your regex:

/hdfs://([\d.]+):(\d+)/([\w/]+/(\w+\.w+))

1) There are unnecessary parentheses (or you forgot a +). This is not a real mistake, but it makes the regex harder to read and debug.

/hdfs://[\d.]+:\d+/[\w/]+/\w+\.w+

2) The last w+ is not escaped, so it matches literal w characters instead of word characters (it should be \w+):

/hdfs://[\d.]+:\d+/[\w/]+/\w+\.\w+

3) The last part only allows . as a separator, not +:

/hdfs://[\d.]+:\d+/[\w/]+/\w+([.+]\w+)+

The expression above matches your test case. However, I suspect you actually want this expression:

/hdfs://\d+(\.\d+){3}:\d+(/(\w+([-+.*/]\w+)*))+
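The steps above can be checked with a small sketch that uses only the standard library. The object name RegexCheck, the value names, and the fullMatch helper are hypothetical; the patterns and the test input are copied from the thread. A full-string match is used here because that is also what r.unapplySeq(s) requires in the question's code.

```scala
import scala.util.matching.Regex

// Hypothetical helper object for comparing the patterns discussed above.
object RegexCheck {
  val input = "/hdfs://111.33.55.2:8888/folder1/p.a3d+1"
  val pathOnly = "/hdfs://111.33.55.2:8888/folder1/p.a3d"

  val original  = """/hdfs://([\d.]+):(\d+)/([\w/]+/(\w+\.w+))""".r       // question's regex
  val escaped   = """/hdfs://[\d.]+:\d+/[\w/]+/\w+\.\w+""".r              // after step 2
  val withPlus  = """/hdfs://[\d.]+:\d+/[\w/]+/\w+([.+]\w+)+""".r         // after step 3
  val suggested = """/hdfs://\d+(\.\d+){3}:\d+(/(\w+([-+.*/]\w+)*))+""".r // final suggestion

  // True only when the regex matches the entire string.
  def fullMatch(r: Regex, s: String): Boolean = r.pattern.matcher(s).matches()

  def main(args: Array[String]): Unit = {
    for ((name, r) <- Seq("original" -> original, "escaped" -> escaped,
                          "withPlus" -> withPlus, "suggested" -> suggested))
      println(s"$name: path only = ${fullMatch(r, pathOnly)}, with +1 = ${fullMatch(r, input)}")
  }
}
```

With these inputs, the original pattern matches neither string, the escaped one matches only the path without +1, and the last two match the full test input.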
