Bison token is rest of the string

Posted 2019-07-15 19:28

Question:

I have written a flex scanner and a bison parser, and I am facing a problem that the program below illustrates.

The program is intended to parse key-value pairs separated by an equals sign (=). I expect the parser to receive the key and the value as separate tokens and print them.

Below is my flex program:

%{
    /* file : kvp.l */
    #include <stdio.h>
    #define YYSTYPE char*
    #include "y.tab.h"    /* header generated by bison -d --file-prefix=y */
%}

%%
[a-zA-Z0-9][_a-zA-Z0-9]*    {
                                yylval=yytext;
                                return IDENTIFIER;
                            }
"="                         {
                                yylval=yytext;
                                return EQUALS_OP;
                            }
.                           ; /* Do Nothing */
%%

int yywrap()
{
    return 1;
}

Below is the bison program:

%{
    /* file : kvp.y */
    #include <stdio.h>
    #include <stdlib.h>

    /* semantic values are strings; must match the definition in kvp.l */
    #define YYSTYPE char*

    /* interface to the lexer */
    int yylex();
    int yyerror (char const *s);
%}

%token IDENTIFIER EQUALS_OP
%start stmt

%%
stmt: stmt pair | pair;
pair: IDENTIFIER EQUALS_OP IDENTIFIER
                                        {
                                            printf("YACC : Key     = \"%s\"\n", $1);
                                            printf("YACC : Equals  = \"%s\"\n", $2);
                                            printf("YACC : Value   = \"%s\"\n", $3);
                                        };
%%

int yyerror (char const *s)
{
    fprintf (stderr, "Error String = \"%s\"\n", s);
    return 0;
}

int main( int argc, char* argv[])
{
    yyparse();
    return 0;
}

This is the output I get from the program:

student@debian:~/stack-overflow$ flex kvp.l
student@debian:~/stack-overflow$ bison -d --file-prefix=y kvp.y
student@debian:~/stack-overflow$ gcc lex.yy.c  y.tab.c -o parser
student@debian:~/stack-overflow$ ./parser 
earth=planet
YACC : Key     = "earth=planet"
YACC : Equals  = "=planet"
YACC : Value   = "planet"

As you can see, the printed tokens are wrong: each one shows the whole rest of the input string starting at that token, not just the token itself. Please tell me where I am going wrong in this program.

Answer 1:

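The problem is that you are saving a pointer to yytext, which points into the flex scanner's internal buffer rather than to a private copy of the token. Flex terminates the current token only temporarily: it overwrites the character after the token with a NUL and restores that character when it scans the next token. By the time the bison action runs, all three saved pointers point into the same buffer, so each one prints everything from its own token up to the end of the last token scanned. If you save a copy of the string in yytext instead, each semantic value keeps its own text, e.g.,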

yylval = strdup(yytext);

rather than

yylval = yytext;
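
To make the aliasing concrete, here is a minimal sketch in plain C that imitates the buffer handling described above. It is a simplified model, not flex's actual generated code: the Scanner struct and the functions next_token, copy_token and scan_pair are hypothetical names invented for this illustration, and copy_token stands in for strdup.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a flex scanner: one shared, writable buffer. */
typedef struct {
    char  *buf;      /* input text, modified in place */
    size_t pos;      /* where the next token starts */
    size_t hold_at;  /* index currently overwritten with '\0' */
    char   hold;     /* the character that was overwritten */
    int    holding;  /* nonzero while a temporary '\0' is in place */
} Scanner;

static int is_ident(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Returns a pointer INTO s->buf (like yytext), or NULL at end of input. */
static char *next_token(Scanner *s)
{
    if (s->holding) {                    /* restore the previous terminator */
        s->buf[s->hold_at] = s->hold;
        s->holding = 0;
    }
    while (s->buf[s->pos] != '\0' && s->buf[s->pos] != '=' &&
           !is_ident(s->buf[s->pos]))
        s->pos++;                        /* ignore every other character */
    if (s->buf[s->pos] == '\0')
        return NULL;

    size_t start = s->pos;
    if (s->buf[s->pos] == '=') {
        s->pos++;
    } else {
        while (is_ident(s->buf[s->pos]))
            s->pos++;
    }
    s->hold_at = s->pos;                 /* terminate the token temporarily */
    s->hold = s->buf[s->pos];
    s->buf[s->pos] = '\0';
    s->holding = 1;
    return s->buf + start;
}

/* Same effect as strdup: a private heap copy of the token text. */
static char *copy_token(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        abort();
    memcpy(copy, text, n);
    return copy;
}

/* Scans key, '=', value from a fixed input, keeping either raw pointers
   (copy == 0) or private copies (copy != 0), then writes "key|eq|value". */
static void scan_pair(int copy, char *out, size_t out_size)
{
    char input[] = "earth=planet\n";
    Scanner s = { input, 0, 0, '\0', 0 };
    char *tok[3];

    for (int i = 0; i < 3; i++) {
        char *t = next_token(&s);
        assert(t != NULL);               /* the fixed input has three tokens */
        tok[i] = copy ? copy_token(t) : t;
    }
    snprintf(out, out_size, "%s|%s|%s", tok[0], tok[1], tok[2]);

    if (copy)
        for (int i = 0; i < 3; i++)
            free(tok[i]);
}
```

Calling scan_pair(0, out, sizeof out) leaves "earth=planet|=planet|planet" in out, which mirrors the output in the question, while scan_pair(1, out, sizeof out) leaves "earth|=|planet". In the real scanner, strdup needs #include <string.h> in kvp.l, and each copy should be released with free in the bison action once it is no longer needed.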

Further reading:

  • Re: [Flex-help] yytext return as char*
  • 21.3 A Note About yytext And Memory