Lex program for counting the number of comment lines

Posted 2019-07-25 12:42

Question:

This program counts the number of comments in a file, both single-line and multi-line, and prints each count plus the total. It takes file.txt as input.

file.txt

//hellow world
/*hello world1*/
/*hello world2
*/
/*hello world3
hello world3.1*/
#include<>

count.l

%{
#include<stdio.h>
#include<stdlib.h>
int a=0,b=0,c=0,d;
%}
%%
"//".* {a++;}
"/*" {b++;}
.*"*/" {b--;c++;}
%%
void main(int argc,char *argv[]){
    yyin=fopen(argv[1],"r");
    yylex();
    printf("single line %d \nmultiline %d \n",a,c);
    d=a+c;
    printf("total %d \n",d);
}

The output I get is:

./a.out file.txt

hello world2 

hello world3



#include<>
single line 1 
multiline 3 
total 4 

The output I need is just:

#include<>
single line 1 
multiline 3 
total 4 

I also tried appending .* to "/*", like this: "/*".* but then the rule also consumes the "*/" on the same line, and the multi-line comment count comes out as 2. I have tried various approaches, but I am stuck.

Answer 1:

This is what start states are for: they allow you to define different match rules for different states.

%{
#include<stdio.h>
#include<stdlib.h>
int a=0,b=0,c=0,d;
%}
%x COMMENT    /* an exclusive state that does not also match normal stuff */
%%
"//".*   {a++;}
"/*"     { BEGIN COMMENT; }
<COMMENT>"*/" {c++; BEGIN INITIAL; }
<COMMENT>.    ;
%%
int main(int argc,char *argv[]){
    yyin=fopen(argv[1],"r");
    yylex();
    printf("single line %d \nmultiline %d \n",a,c);
    d=a+c;
    printf("total %d \n",d);
    return 0;
}

This will deal properly with things like

/*  //  */  ..this is not a comment..

that will confuse most other ways of attempting to do this. It also continues to output newlines that were inside comments (so a multi-line /*..*/ comment will show up as blank lines). If you don't want that, you can add a rule for <COMMENT>\n with an empty action.
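
To make the two states concrete, here is a rough hand-written C equivalent of the scanner above. It is only a sketch, not what lex generates, and count_comments and struct comment_counts are names made up for this illustration. Unlike the rules as posted, it also drops newlines inside comments, which is the effect an extra <COMMENT>\n rule would have.

```c
#include <stdio.h>

/* Hypothetical result type for this sketch: one counter per comment kind. */
struct comment_counts {
    int single_line;
    int multi_line;
};

/*
 * Scan `src` with two states that mirror INITIAL and COMMENT.
 * Text outside comments is echoed to `out` (pass NULL to discard it).
 */
static struct comment_counts count_comments(const char *src, FILE *out)
{
    struct comment_counts n = {0, 0};
    int in_comment = 0;                 /* 0 = INITIAL, 1 = COMMENT */

    for (const char *p = src; *p != '\0'; p++) {
        if (in_comment) {
            if (p[0] == '*' && p[1] == '/') {
                // closing delimiter: count the comment, return to INITIAL
                n.multi_line++;
                in_comment = 0;
                p++;                    /* skip the '/' as well */
            }
            /* anything else inside a comment, newlines included, is dropped */
        } else if (p[0] == '/' && p[1] == '/') {
            // single-line comment: consume up to, but not including, the newline
            n.single_line++;
            while (p[1] != '\0' && p[1] != '\n')
                p++;
        } else if (p[0] == '/' && p[1] == '*') {
            // opening delimiter: switch to the COMMENT state
            in_comment = 1;
            p++;
        } else if (out != NULL) {
            fputc(*p, out);             /* default action: echo the character */
        }
    }
    return n;
}
```

Because "//" is only looked for outside the comment state, the /*  //  */ line above counts as one multi-line comment and no single-line comment.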



Answer 2:

I added more precise logic to make it work better: a flag, e, records whether the scanner is currently inside a multi-line comment.

%{
#include<stdio.h>
#include<stdlib.h>
int a=0,c=0,d,e=0;
%}
%%
"/*" {if(e==0)e++;}
"*/" {if(e==1){e=0;c++;}}
"//".* {if(e==0)a++;}
. {if(e==0)ECHO;}
%%
int main(int argc,char *argv[]){
    yyin=fopen(argv[1],"r");
    yyout=fopen(argv[2],"w");
    yylex();
    printf("single line %d \nmultiline %d \n",a,c);
    d=a+c;
    printf("total %d \n",d);
    return 0;
}


Answer 3:

%{
    #include<stdio.h>
    #include<stdlib.h>
    int a=0,b=0,d;
%}
%%
"//".* {a++;}
[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/] {b++;}

%%
int main(int argc,char *argv[]){
    yyin=fopen(argv[1],"r");
    yylex();
    printf("single line %d \nmultiline %d \n",a,b);
    d=a+b;
    printf("total %d \n",d);
    return 0;
}

The . matches everything else. This does the trick.

Also, as pointed out in your answer, you just printed the rest of the characters by using the . rule.
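
If you want to try the multi-line pattern outside lex, the same expression happens to be valid POSIX extended regex syntax, so it can be checked from plain C with <regex.h>. This is only a sketch under that assumption: first_comment_length is a made-up helper, and regexec searches for the first match anywhere in the text, which is not exactly how a lex scanner picks its rules.

```c
#include <regex.h>
#include <stddef.h>

/* The multi-line comment pattern from the answer, unchanged. */
static const char *COMMENT_PATTERN =
    "[/][*][^*]*[*]+([^*/][^*]*[*]+)*[/]";

/*
 * Hypothetical helper: length of the first block comment found in `text`,
 * or -1 if there is none (or the pattern fails to compile).
 */
static int first_comment_length(const char *text)
{
    regex_t re;
    regmatch_t m;
    int len = -1;

    if (regcomp(&re, COMMENT_PATTERN, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0)
        len = (int)(m.rm_eo - m.rm_so);   /* span of the whole match */
    regfree(&re);
    return len;
}
```

The pattern stops at the first closing delimiter, so for "/* a */ x /* b */" only the first comment is matched. An unterminated "/* ..." gives no match at all.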