What does the \\? (backslash question mark) escape

Posted 2019-06-22 13:40

Question:

I'm writing a regular expression in Objective-C.

The escape sequence \w is illegal and emits a warning, so the regular expression /\w/ must be written as @"\\w"; the escape sequence \? is valid, apparently, and doesn't emit a warning, so the regular expression /\?/ must be written as @"\\?" (i.e., the backslash must be escaped).

Question marks aren't invisible like \t or \n, so why is \? a valid escape sequence?

Edit: To clarify, I'm not asking about the quantifier, I'm asking about a string escape sequence. That is, this doesn't emit a warning:

NSString *valid = @"\?";

By contrast, this does emit a warning ("Unknown escape sequence '\w'"):

NSString *invalid = @"\w";

Answer 1:

It specifies a literal question mark. It is needed because of a little-known C feature called trigraphs: a three-character sequence that starts with two question marks is replaced by another character. If you have trigraphs enabled, then in order to write "??" in a string you may need to write it as "?\?" so that the two question marks are not read as the beginning of a trigraph.

(If you're wondering "Why would anybody introduce a feature like this?": some keyboards and character sets didn't include commonly used symbols like {, so trigraphs were introduced to let you write ??< instead.)



Answer 2:

In a regex, ? is a quantifier that means 0 or 1 occurrences. When appended to the + or * quantifiers, it makes them "lazy".

For example, the regex o? applied to the string foo? matches at most one o at a time; the ? at the end of the string is never part of the match.

However, the regex o\? in foo? would match o?, because it searches for a literal question mark in the string instead of treating ? as a quantifier.

Applying the greedy regex o* to foo? can match oo, whereas the lazy o*? matches as few characters as possible, which here is the empty string.
