How to use backslash escape char for new line in J

Posted 2019-02-26 23:12

I have an assignment to create a lexical analyser and I've got everything working except for one bit. I need to create a string that will accept a new line, and the string is delimited by double quotes. The string accepts any number, letter, some specified punctuation, backslashes and double quotes within the delimiters. I can't seem to figure out how to escape a new line character. Is there a certain way of escaping characters like new line and tab?

Here's some of my code that might help

< STRING : ( < QUOTE> (< QUOTE > | < BACKSLASH > | < ID > | < NUM > | " " )* <QUOTE>) >
< #QUOTE : "\"" >
< #BACKSLASH : "\\" >

So my string should allow for a quote, then any of the following characters like a backslash, a whitespace, a number etc, and then followed by another quote. The newline char like "\n" is what's not working. Thanks in advance!

1 answer
等我变得足够好
Reply #2 · 2019-02-27 00:10

For string literals, JavaCC borrows the syntax of Java. So, a single-character literal comprising a carriage return is escaped as "\r", and a single-character literal comprising a line feed is escaped as "\n".

However, the processed string value is just a single character; it is not the escape itself. So, suppose you define a token for line feed:

< LF : "\n" >

A match of the token <LF> will be a single line-feed character. When substituting the token in the definition of another token, the single character is effectively substituted. So, suppose you have the higher-level definition:

< STRING : "\"" ( <LF> ) "\"" >

A match of the token <STRING> will be three characters: a quotation mark, followed by a line feed, followed by a quotation mark. What you seem to want instead is for the escape sequence to be recognized:

< STRING : "\"" ( "\\n" ) "\"" >

Now a match of the token <STRING> will be four characters: a quotation mark, followed by the two-character escape sequence representing a line feed (a backslash and the letter n), followed by a quotation mark.
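
Since JavaCC string literals follow Java's rules, the difference between the two definitions can be checked in plain Java. The following is only a minimal sketch: the class and method names are made up for illustration and are not part of JavaCC.

```java
// Illustrative only: EscapeLengths is a hypothetical helper, not a JavaCC API.
final class EscapeLengths {
    // Text matched by the <LF> version of STRING: quote, line feed, quote.
    static String withLineFeed() {
        return "\"" + "\n" + "\"";
    }

    // Text matched by the "\\n" version of STRING: quote, backslash, 'n', quote.
    static String withEscapeSequence() {
        return "\"" + "\\n" + "\"";
    }

    public static void main(String[] args) {
        System.out.println(withLineFeed().length());       // prints 3
        System.out.println(withEscapeSequence().length()); // prints 4
    }
}
```

The same literal `"\\n"` written inside a JavaCC token definition therefore denotes two input characters, which is what a lexer for escape sequences needs to see.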

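In your current definition, other commonly escaped metacharacters, such as the quotation mark and the backslash, are also being recognized literally rather than as escape sequences. As a result, a quotation mark inside the string cannot be told apart from the closing delimiter.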
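
To illustrate what recognizing those metacharacters as escape sequences could look like, here is a rough sketch that uses java.util.regex in place of a JavaCC token, so that it can be run on its own. The class name, the method names, and the chosen set of escapes (\n, \t, \r, \" and \\) are assumptions for illustration; the assignment's actual list of allowed characters may differ.

```java
import java.util.regex.Pattern;

// Illustrative only: StringLiteralSketch is a hypothetical stand-in for a JavaCC token.
final class StringLiteralSketch {
    // A quote, then any mix of ordinary characters (anything except quote,
    // backslash, CR and LF) and two-character escapes, then a closing quote.
    private static final Pattern STRING =
        Pattern.compile("\"(?:[^\"\\\\\\r\\n]|\\\\[ntr\"\\\\])*\"");

    static boolean isStringLiteral(String text) {
        return STRING.matcher(text).matches();
    }

    // Converts the matched text (delimiters included) into the value it denotes.
    static String decode(String literal) {
        if (!isStringLiteral(literal)) {
            throw new IllegalArgumentException("not a string literal");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < literal.length() - 1; i++) {
            char c = literal.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            char escaped = literal.charAt(++i); // validated above, so this exists
            switch (escaped) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                default:  out.append(escaped); // \" and \\ denote the character itself
            }
        }
        return out.toString();
    }
}
```

With this shape, a raw line feed between the quotes is rejected, while a backslash followed by n is accepted and decoded to a line feed. The design point is that a backslash or quotation mark may only appear inside the string as the second half of an escape, so the closing delimiter is always unambiguous.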
