How to reference lex or parse parameters in flex r

Posted 2019-08-12 18:54

Question:

I know I can declare %parse-param {struct my_st *arg} in a .y file, which changes yyparse() to yyparse(struct my_st *arg). But how do I reference the argument in the flex rules? For example:

[0-9]+  { do_work(arg); return NUMBER; }

I want to make a reentrant parser, so I need to do this. Please help me, thanks!

Answer 1:

You need to pass the argument through to yylex. That requires a modification of both the bison parser description, so that the parser calls yylex with the desired arguments, and the flex scanner description, so that the scanner generates yylex with the desired parameters.

Bison and flex do not communicate with each other and do not see each other's source files. However, it is normal for the scanner to #include the header file generated by bison, and bison allows the possibility of inserting code directly into this header file. That makes it possible to put the entire configuration into the bison file.

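In bison, you specify additional parameters for yylex using the %lex-param directive. Be aware that if you also %define api.pure, bison adds arguments of its own to the call: a pointer to the semantic value and, if locations are enabled, a pointer to the location, both placed before your %lex-param arguments.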

If you use bison 3.0 or more recent, you can use

%param { struct my_st *arg }

as an abbreviation for

%lex-param { struct my_st *arg }
%parse-param { struct my_st *arg }

It makes sense to use a single directive (if your bison is sufficiently recent) because there is no way to insert a local variable declaration into the yyparse function. So the only variables which could be passed through to yylex are global variables and parameters to yyparse. [Note 1]

Remember that it is your responsibility to declare yylex and yyerror in the bison file. Even if you use %lex-param, bison will not automatically generate a declaration for yylex. [Note 2]

Flex does normally generate a declaration for yylex, so you cannot simply put your declaration into the bison-generated header file and then #include it into the scanner. However, if the YY_DECL macro is defined, then the flex-generated scanner will not forward-declare yylex, and it will use the YY_DECL macro in the definition of yylex. You can use this feature to put the declaration of yylex into the bison description in a way that it will be passed through to the flex scanner.

In bison, you can add declarations to the generated header using either %code requires sections or %code provides sections. The difference is that requires segments are earlier in the header file, before YYSTYPE and YYLTYPE have been declared. If you use a pure parser, the yylex prototype will usually refer to YYSTYPE (and YYLTYPE, if you use locations), so it needs to go in a %code provides section. In order to interface gracefully with flex, you can use the YY_DECL macro to generate the yylex declaration.

So you might end up with something like the following: [Note 3]

file: mylanguage.y

%code requires {
  #include <stdio.h>

  typedef struct Context { ... } Context;

  /* structs used in the %union declaration would go here */
}

%define api.pure full
%locations
%parse-param { Context* context }
%lex-param { Context* context }

%code provides {
   #define YY_DECL \
       int yylex(YYSTYPE* yylvalp, YYLTYPE* yyllocp, Context* context)
   YY_DECL;

   int yyerror(YYLTYPE* yyllocp, Context* context, const char* message);
}

Then you would just insert the generated header into your flex file in the normal way:

file: mylanguage.l

%{
   /* System library includes */
   #include "mylanguage.tab.h"
%}

The %define api.pure full declaration in bison avoids the need for the global variables yylval and yylloc. However, there are a number of other internal globals used by a flex-generated scanner; in order to make the scanner truly re-entrant, you need to add %option reentrant to your flex file. With that option, yylex is expected to include the parameter yyscan_t yyscanner (as are all the other lexer-related functions defined by flex). If you are generating the yylex prototype using YY_DECL, the parameter's name must be precisely yyscanner. You need to manage the yyscanner value yourself, so it will need to be passed through yyparse to yylex as above. You also need to initialize and destroy it, as described in the flex manual.
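
As a rough sketch of how the bison side might look once the scanner is re-entrant, the declarations in mylanguage.y could be adapted as below. It assumes the scanner handle takes the place of the Context pointer as the extra parameter. The guarded typedef mirrors the one flex emits in its own output, so that the header can be included before the scanner defines yyscan_t. Treat it as an illustration under those assumptions rather than a drop-in file.

```yacc
%code requires {
  /* Same guard macro flex uses, so the two typedefs do not collide. */
  #ifndef YY_TYPEDEF_YY_SCANNER_T
  #define YY_TYPEDEF_YY_SCANNER_T
  typedef void* yyscan_t;
  #endif
}

%define api.pure full
%locations
%parse-param { yyscan_t yyscanner }
%lex-param { yyscan_t yyscanner }

%code provides {
   /* The last parameter must be named yyscanner for flex's macros to work. */
   #define YY_DECL \
       int yylex(YYSTYPE* yylvalp, YYLTYPE* yyllocp, yyscan_t yyscanner)
   YY_DECL;

   int yyerror(YYLTYPE* yyllocp, yyscan_t yyscanner, const char* message);
}
```

A driver would then create the handle with yylex_init(&scanner), call yyparse(scanner), and release the handle with yylex_destroy(scanner).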

If you also want to pass your own context object, you could add two parameters to yyparse and yylex, or you could include your context object inside the yyscan_t object as described in the Flex manual section on Extra Data.
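
The extra-data route might look roughly like the following flex fragment. It assumes that Context is declared in the bison-generated header, that the YY_DECL in that header has been changed to take yyscan_t yyscanner as its last parameter, and that do_work and NUMBER are the function and token from the question. Inside yylex, flex makes the stored pointer available as yyextra.

```lex
%{
   /* Provides Context, YY_DECL and the token definitions (assumed). */
   #include "mylanguage.tab.h"
%}

%option reentrant noyywrap
%option extra-type="Context *"

%%

[0-9]+      { do_work(yyextra);   /* yyextra has type Context * here */
              return NUMBER; }
```

With this arrangement the driver would create the scanner with yylex_init_extra(context, &scanner) instead of yylex_init(&scanner), and code outside yylex could retrieve the pointer with yyget_extra(scanner).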

Finally, if you use the bison pure parser API, then you need to change the way you write flex actions. Instead of assigning to the global yylval (e.g. yylval.integer = atol(yytext);), you need to assign through the pointer passed as an argument: yylvalp->integer = atol(yytext);. (The name of the argument is up to you; the one I use here is the one I specified above.)
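
Putting this together with the question's rule, the rules section of mylanguage.l might look something like the sketch below. It assumes the non-reentrant YY_DECL from the mylanguage.y example above, so the parameter names yylvalp and context are the ones declared there. It also assumes that the %union has an integer member, and that do_work (from the question) is declared somewhere the scanner can see it, for instance in the %code requires block.

```lex
%{
   #include <stdlib.h>            /* atol */
   #include "mylanguage.tab.h"    /* YYSTYPE, YYLTYPE, Context, YY_DECL, NUMBER */
%}

%option noyywrap

%%

[0-9]+      { yylvalp->integer = atol(yytext);  /* store through the pointer */
              do_work(context);                 /* parameter named in YY_DECL */
              return NUMBER; }
[ \t\n]+    { /* skip whitespace */ }
.           { return yytext[0]; }
```

Because the action code is compiled inside the body of yylex, any parameter named in YY_DECL is directly in scope in every rule.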


Notes

  1. Older implementations allowed you to specify the arguments to yylex by defining the macro YYLEX_PARAM. This is no longer supported as of bison 3.0, so you shouldn't use it.

  2. If you use %parse-param, the additional parameter will also be added to yyerror. And if you %define api.pure full, yyerror will also receive a location object. Your declaration of yyerror needs to be consistent.

  3. The %locations directive forces bison to generate code which stores location information for each token. I use it here because it makes the prototypes predictable. Without it, the prototypes would include YYLTYPE arguments only if you actually refer to a location somewhere in a semantic action. If you don't intend to use token locations, you might prefer to remove the %locations directive and all the YYLTYPE arguments. But usually the location information is useful.
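
For comparison with Note 3, a version of the declarations without location tracking might look like the fragment below. It assumes that no semantic action refers to a location (no @n or @$), and that Context is still declared in a %code requires block as in the main example.

```yacc
%define api.pure full
%parse-param { Context* context }
%lex-param { Context* context }

%code provides {
   /* No YYLTYPE arguments: bison passes only the semantic value pointer. */
   #define YY_DECL \
       int yylex(YYSTYPE* yylvalp, Context* context)
   YY_DECL;

   int yyerror(Context* context, const char* message);
}
```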