I have an assignment to create a lexical analyser, and I've got everything working except for one bit. I need to define a string token, delimited by double quotes, that accepts a newline. Between the delimiters, the string accepts any number, letter, some specified punctuation, backslashes and double quotes. I can't figure out how to escape a newline character. Is there a particular way of escaping characters like newline and tab?
Here's some of my code that might help:
< STRING : ( < QUOTE> (< QUOTE > | < BACKSLASH > | < ID > | < NUM > | " " )* <QUOTE>) >
< #QUOTE : "\"" >
< #BACKSLASH : "\\" >
So my string should allow for a quote, then any of the following characters like a backslash, a whitespace, a number etc, and then followed by another quote. The newline char like "\n" is what's not working. Thanks in advance!
For string literals, JavaCC borrows the syntax of Java. So a single-character literal comprising a carriage return is written "\r", and a single-character literal comprising a line feed is written "\n".
However, the processed string value is just a single character; it is not the escape sequence itself. So, suppose you define a token <LF> for line feed, whose definition is just the literal "\n".
A match of the token <LF> will be a single line-feed character. When you substitute the token into the definition of another token, that single character is effectively what gets substituted. So, suppose you have a higher-level definition in which <STRING> is a quotation mark, then <LF>, then a quotation mark.
A match of the token <STRING> will be three characters: a quotation mark, followed by a line feed, followed by a quotation mark. What you seem to want instead is for the escape sequence itself to be recognized, which means matching a backslash followed by the letter n (written "\\n" in the grammar).
Now a match of the token <STRING> will be four characters: a quotation mark, followed by the two-character escape sequence representing a line feed, followed by a quotation mark.
In your current definition, I see that other often-escaped metacharacters, like quotation mark and backslash, are also being recognized literally rather than as escape sequences.
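To make the contrast concrete, here is a rough sketch of how the definitions discussed above might look side by side. The token names STRING_RAW_LF and STRING_ESCAPED_LF are hypothetical, chosen only so that both variants can sit in one grammar; in your own file you would keep a single STRING token. QUOTE is taken from your code.

```javacc
TOKEN : {
  < #QUOTE : "\"" >

  // "\n" is read like a Java string literal, so this matches ONE character:
  // an actual line feed in the input.
| < LF : "\n" >

  // Matches 3 input characters: quote, a real line feed, quote.
| < STRING_RAW_LF : <QUOTE> <LF> <QUOTE> >

  // "\\n" is a backslash followed by the letter n, so this matches
  // 4 input characters: quote, backslash, n, quote.
| < STRING_ESCAPED_LF : <QUOTE> "\\n" <QUOTE> >
}
```

The same pattern should extend to the other escapes: "\\t" would match a tab escape in the input, "\\\"" an escaped quotation mark, and "\\\\" an escaped backslash.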