ocamlyacc parse error: what token?

Posted 2019-02-16 07:16

Question:

I'm using ocamlyacc and ocamllex. I have an error production in my grammar that signals a custom exception. So far, I can get it to report the error position:

| error { raise (Parse_failure (string_of_position (symbol_start_pos ()))) }

But, I also want to know which token was read. There must be a way---anyone know?

Thanks.

Answer 1:

Tokens are generated by the lexer, so you can use the lexer's current token when an error occurs:

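  (* Error is assumed to be declared elsewhere; it carries the original
     exception plus (line, column, token, remaining input). *)
  let parse_buf_exn lexbuf =
    try
      T.input T.rule lexbuf
    with exn ->
      begin
        (* Position of the lexer at the moment the parser gave up. *)
        let curr = lexbuf.Lexing.lex_curr_p in
        let line = curr.Lexing.pos_lnum in
        (* Column = absolute offset minus the offset of the line start. *)
        let cnum = curr.Lexing.pos_cnum - curr.Lexing.pos_bol in
        (* Text of the last token the lexer matched. *)
        let tok = Lexing.lexeme lexbuf in
        (* Rest of the input, to help the user locate the error. *)
        let tail = Sql_lexer.ruleTail "" lexbuf in
        raise (Error (exn,(line,cnum,tok,tail)))
      end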

Lexing.lexeme lexbuf is the part you need; the rest is optional but useful. ruleTail concatenates all the remaining tokens into a string so that the user can easily locate the error. For the reported positions to be correct, the lexer must keep lexbuf.Lexing.lex_curr_p up to date. (source)
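To see what these Lexing fields contain without setting up a full ocamllex/ocamlyacc project, here is a minimal sketch. The names next_word, parse_numbers and parse_string are hypothetical stand-ins, not part of any library: next_word imitates what a generated lexer rule does to the lexbuf, and parse_numbers raises Parsing.Parse_error the way a generated parser would. The handler reads lex_start_p rather than lex_curr_p so that the column points at the start of the offending token.

```ocaml
exception Parse_failure of string

(* Hypothetical stand-in for an ocamllex rule: scans one space-delimited
   word and updates the lexbuf fields the way a generated lexer would.
   Assumes single-line input, so pos_lnum and pos_bol never change. *)
let next_word (lexbuf : Lexing.lexbuf) : string option =
  let open Lexing in
  let buf = lexbuf.lex_buffer and len = lexbuf.lex_buffer_len in
  let i = ref lexbuf.lex_curr_pos in
  while !i < len && Bytes.get buf !i = ' ' do incr i done;
  if !i >= len then None
  else begin
    let j = ref !i in
    while !j < len && Bytes.get buf !j <> ' ' do incr j done;
    lexbuf.lex_start_pos <- !i;
    lexbuf.lex_curr_pos <- !j;
    lexbuf.lex_start_p <- { lexbuf.lex_curr_p with pos_cnum = !i };
    lexbuf.lex_curr_p <- { lexbuf.lex_curr_p with pos_cnum = !j };
    Some (lexeme lexbuf)
  end

(* Hypothetical stand-in for a generated parser: accepts only integers
   and raises Parsing.Parse_error on anything else. *)
let rec parse_numbers lexbuf acc =
  match next_word lexbuf with
  | None -> List.rev acc
  | Some w ->
    (match int_of_string_opt w with
     | Some n -> parse_numbers lexbuf (n :: acc)
     | None -> raise Parsing.Parse_error)

(* On failure, the lexbuf still describes the last token that was read. *)
let parse_string s =
  let lexbuf = Lexing.from_string s in
  try parse_numbers lexbuf []
  with Parsing.Parse_error ->
    let p = lexbuf.Lexing.lex_start_p in
    let col = p.Lexing.pos_cnum - p.Lexing.pos_bol in
    raise (Parse_failure
             (Printf.sprintf "line %d, column %d: unexpected token %S"
                p.Lexing.pos_lnum col (Lexing.lexeme lexbuf)))
```

With this sketch, parse_string "1 2 x 4" raises Parse_failure with the message: line 1, column 4: unexpected token "x". In a real project the lexbuf is updated by the ocamllex-generated code, so only the handler in parse_string carries over.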



Answer 2:

The best way to debug your ocamlyacc parser is to set the OCAMLRUNPARAM environment variable so that it includes the character p. This makes the parser print every state it goes through and every shift/reduce it performs.

If you are using bash, you can do this with the following command:

$ export OCAMLRUNPARAM='p'
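The same trace can also be switched on from inside the program with Parsing.set_trace from the standard library, which may be handy if you only want the output for a single parse. A minimal sketch; with_parser_trace is a hypothetical helper name, and Fun.protect assumes OCaml 4.08 or later:

```ocaml
(* Run [f] with the ocamlyacc automaton trace enabled, then restore the
   previous setting even if [f] raises. Parsing.set_trace returns the
   value the flag had before the call. *)
let with_parser_trace (f : unit -> 'a) : 'a =
  let previous = Parsing.set_trace true in
  Fun.protect
    ~finally:(fun () -> ignore (Parsing.set_trace previous))
    f
```

You would wrap the call to your generated entry point in it, for example with_parser_trace (fun () -> Parser.main Lexer.token lexbuf), where Parser.main and Lexer.token stand for whatever your own grammar and lexer define.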


Answer 3:

I think that, as in yacc, the tokens are stored in variables corresponding to the symbols in your grammar rule. Here, since there is only one symbol (error), you may be able to simply output $1 using printf, etc.

Edit: responding to comment.

Why do you use an error terminal? I'm reading an ocamlyacc tutorial that says a special error-handling routine is called when a parse error happens. Like so:

3.1.5. The Error Reporting Routine

When the parser function detects a syntax error, it calls a function named parse_error with the string "syntax error" as argument. The default parse_error function does nothing and returns, thus initiating error recovery (see Error Recovery). The user can define a customized parse_error function in the header section of the grammar file such as:

let parse_error s = (* Called by the parser function on error *)
  print_endline s;
  flush stdout

Well, looks like you only get "syntax error" with that function though. Stay tuned for more info.