Hi, I want to match a specific pattern with a regular expression in lex, but I have not managed to do it. The input should look like:
noun wordname:wordmeaning
I can successfully get the noun and the wordname, but I could not design a pattern for the word meaning. My code is:
int state;
char *meaning;
char *word;
^verb { state=VERB; }
^adj { state = ADJ; }
^adv { state = ADV; }
^noun { state = NOUN; }
^prep { state = PREP; }
^pron { state = PRON; }
^conj { state = CONJ; }
//my try but failed
[:\a-z] {
meaning=yytext;
printf(" Meaning is getting detected %s", meaning);
}
[a-zA-Z]+ {
word=yytext;
}
Example input:
noun john:This is a name
Now word should be equal to john and meaning should be equal to This is a name.
Agreeing that lex states (also known as start conditions) are the way to go (odd, but there are no useful tutorials).
Briefly:
at the top of the lex file, declare the states, e.g.,
%s TYPE NAME VALUE
in the rules, put the state names in < > brackets in front of a pattern to tell lex that the pattern is used only in those states. You can list more than one state, comma-separated, when it matters. But your lex file probably does not need that.
lex starts in the predefined state INITIAL.
your program switches states using the BEGIN() macro, in actions, e.g.,
{ BEGIN(TYPE); }
once the type has been read, switch to the NAME state.
in the NAME state, your lexer looks for whatever you think a name should be, e.g.,
<NAME>[[:alpha:]][[:alnum:]]+ { my_name = strdup(yytext); }
the name ends with a colon, so
<NAME>":" { BEGIN(VALUE); }
the value is then everything until the end of the line, e.g.,
<VALUE>.* { my_value = strdup(yytext); BEGIN(INITIAL); }
whether that last action switches back to INITIAL or to TYPE depends on what other things you might add to your lexer (such as ignoring comment lines and whitespace).
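Putting the steps above together, a complete lexer might look roughly like the sketch below. It assumes flex (for %option noyywrap), uses INITIAL in the role of the TYPE state, and allows one-letter names by using * instead of + in the name pattern. The enum values, the blank-skipping rule, and the printf are illustrative additions, not part of the question's code.

```lex
%{
/* Sketch only: my_name and my_value are illustrative variables,
   not flex built-ins. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { LOOKUP, VERB, ADJ, ADV, NOUN, PREP, PRON, CONJ };

static int state = LOOKUP;     /* part of speech of the current line */
static char *my_name = NULL;   /* text before the colon */
static char *my_value = NULL;  /* text after the colon */
%}

%option noyywrap
%s NAME VALUE

%%

<INITIAL>^verb  { state = VERB; BEGIN(NAME); }
<INITIAL>^adj   { state = ADJ;  BEGIN(NAME); }
<INITIAL>^adv   { state = ADV;  BEGIN(NAME); }
<INITIAL>^noun  { state = NOUN; BEGIN(NAME); }
<INITIAL>^prep  { state = PREP; BEGIN(NAME); }
<INITIAL>^pron  { state = PRON; BEGIN(NAME); }
<INITIAL>^conj  { state = CONJ; BEGIN(NAME); }

<NAME>[ \t]+    { /* skip blanks between the type and the name */ }
<NAME>[[:alpha:]][[:alnum:]]*  { free(my_name); my_name = strdup(yytext); }
<NAME>":"       { BEGIN(VALUE); }

<VALUE>.*       {
                  /* everything up to (not including) the newline */
                  free(my_value);
                  my_value = strdup(yytext);
                  printf("word=%s meaning=%s\n",
                         my_name ? my_name : "", my_value);
                  BEGIN(INITIAL);
                }

\n              { BEGIN(INITIAL); /* every line starts over */ }
.               { /* ignore anything unexpected */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Saved under a hypothetical name such as words.l, this should build with flex words.l followed by cc lex.yy.c -o words. Given the line noun john:This is a name on standard input, the VALUE rule sees This is a name in yytext and the NAME rule sees john. Note that strdup() copies the text; assigning yytext directly, as in the question, only stores a pointer into a buffer that lex overwrites on the next match.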