Minimal bison/flex-generated code has memory leak

Posted 2019-08-01 03:53

Question:

In debugging a memory leak on a large project, I found that the source of the leak seemed to be some flex/bison-generated code. I was able to recreate the leak with the following minimal example consisting of two files, sand.l and sand.y:

in sand.l:

%{
#include <stdlib.h>
#include "sand.tab.h"
%}

%%
[0-9]+ { return INT; }
. ;
%%

in sand.y:

%{
#include <stdio.h>
#include <stdlib.h>

int yylex();
int yyparse();
extern FILE* yyin;

void yyerror(const char* s);
%}

%token INT

%%
program:
       program INT { puts("Found integer"); }
       | 
       ;
%%

int main(int argc, char* argv[]) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));
    return 0;
}

void yyerror(const char* s) {
    puts(s);
}

The code was compiled with

$ bison -d sand.y
$ flex sand.l
$ gcc -g lex.yy.c sand.tab.c -o main -lfl

Running the program with valgrind gave the following error:

8 bytes in 1 blocks are still reachable in loss record 1 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x402126: yyensure_buffer_stack (lex.yy.c:1423)
by 0x400B89: yylex (lex.yy.c:669)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

64 bytes in 1 blocks are still reachable in loss record 2 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CBF: yy_create_buffer (lex.yy.c:1258)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

16,386 bytes in 1 blocks are still reachable in loss record 3 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CF6: yy_create_buffer (lex.yy.c:1267)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

It seems that bison and/or flex is holding on to a substantial amount of memory. Is there any way to force them to free it?

Answer 1:

The default flex skeleton allocates an input buffer and a small buffer stack, which it never frees on its own. You can free the input buffer manually with yy_delete_buffer(YY_CURRENT_BUFFER); but that call does not free the buffer stack. (It's only 8 bytes in your application, so it's not a disaster.)
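
For the non-reentrant scanner in the question, a minimal sketch of a fuller cleanup, assuming a flex version that emits yylex_destroy() for non-reentrant scanners (check your lex.yy.c for it). The declaration is written by hand because sand.tab.h does not provide one. This would replace main in the epilogue of sand.y:

```c
/* Defined by flex in lex.yy.c; not declared in sand.tab.h. */
int yylex_destroy(void);

int main(int argc, char* argv[]) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));
    /* Releases the current input buffer and the buffer stack. */
    yylex_destroy();
    return 0;
}
```

Call it only once scanning is finished, since it also resets the scanner's global state.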

If you want to write a clean application, you should generate a reentrant scanner, which puts all persistent data into a scanner context object. Your code must allocate and free this object, and freeing it will free all memory allocations. (You might also want to generate a pure parser, which works roughly the same way.)

However, the reentrant scanner has a very different API, so you will need to make your parser pass the scanner context object through to yylex. If you use a reentrant (pure) parser as well, you'll need to modify your scanner actions, because with the reentrant parser yylval is a YYSTYPE* instead of a YYSTYPE.
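
A rough sketch of what the question's two files might look like with a reentrant scanner and a pure parser. It assumes Bison 3.0 or later (for %define api.pure full) and a flex that supports %option reentrant bison-bridge. The parameter name scanner and the printf in the action are illustrative choices, not part of the original code.

sand.l:

```c
%{
#include <stdlib.h>
#include "sand.tab.h"
%}

%option reentrant bison-bridge noyywrap

%%
[0-9]+ { *yylval = atoi(yytext); return INT; }
.      ;
%%
```

sand.y:

```c
%define api.pure full
%lex-param   { void *scanner }
%parse-param { void *scanner }

%code {
#include <stdio.h>

/* yyscan_t in the generated scanner is a typedef for void *. */
int yylex(YYSTYPE *yylval, void *scanner);
int yylex_init(void **scanner);
int yylex_destroy(void *scanner);
void yyset_in(FILE *in, void *scanner);
void yyerror(void *scanner, const char *s);
}

%token INT

%%
program:
       program INT { printf("Found integer %d\n", $2); }
       |
       ;
%%

int main(void) {
    void *scanner;                 /* scanner context object */
    if (yylex_init(&scanner) != 0)
        return 1;
    yyset_in(stdin, scanner);
    yyparse(scanner);
    yylex_destroy(scanner);        /* frees the buffers and the context */
    return 0;
}

void yyerror(void *scanner, const char *s) {
    (void)scanner;
    puts(s);
}
```

Because of noyywrap and the hand-written main, this version should build with the same bison and flex commands as before but without -lfl. Note the *yylval in the scanner action: with bison-bridge, yylval is a pointer to YYSTYPE.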