Regex removing anything that is not a 14-digit num

Posted 2019-06-05 23:49

Question:

I'm trying to invert this expression: ([0-9]{14} ), which matches every 14-digit number followed by a space.

I looked everywhere, and it seems the best way is to use a negative lookahead.
But when I try to apply the q(?!u) idea to my case, as (?!([0-9]{14} )), it doesn't work.

What am I doing wrong?

I would appreciate any advice, thank you.

The point is to remove everything that is not a 14-digit chunk of text while preserving those 14-digit chunks.

Answer 1:

If you want to delete all text other than 14-digit numbers followed by a space, use (\b\d{14} )|. and replace with $1.

The first alternative matches a 14-digit chunk followed by a space and captures it in group 1, so the replacement pattern can refer to it with a backreference. The \b (word boundary) ensures the chunk starts as a whole word, so the last 14 digits of a longer number are not picked up. Where that alternative does not match, the second alternative, ., matches any single character other than a newline and captures nothing, so there is no text for the backreference to restore.

Thus, when we replace with the backreference $1, each 14-digit chunk and its space is put back and everything else is replaced with an empty string.
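The same capture-or-discard idea can be sketched in Python. This is only an illustration under assumptions: the function name keep_14_digit_chunks and the sample string are made up for the example, and a replacement function is used in place of $1 so that the unmatched group is handled explicitly.

```python
import re

# Either capture a 14-digit chunk plus its trailing space (group 1),
# or match any single character other than a newline without capturing it.
PATTERN = re.compile(r"(\b\d{14} )|.")

def keep_14_digit_chunks(text: str) -> str:
    # Hypothetical helper: group 1 is None when the "." branch matched,
    # so that character is replaced with an empty string.
    return PATTERN.sub(lambda m: m.group(1) or "", text)

sample = "abc 12345678901234 def 98765432109876 xyz"
print(repr(keep_14_digit_chunks(sample)))
# prints '12345678901234 98765432109876 '
```

Because . does not match newlines, line breaks in the input survive the substitution, which is why empty lines may be left behind.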

See the regex demo at regex101.com.

For a cleaner view, remove all empty lines afterwards: Edit > Line Operations > Remove Empty Lines.



Answer 2:

You can use this negative lookahead:

^(?!.*[0-9]{14} )
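  • Make sure you use the start anchor ^, so the lookahead is checked once, from the beginning of the line.
  • It is also important to put .* before your pattern inside the lookahead, so the 14-digit chunk is disallowed anywhere in the input, not just at the start.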
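A rough Python sketch of how this lookahead behaves. Note that the lookahead itself is zero-width, so on its own it only marks lines that contain no 14-digit chunk. The trailing .*\n? that consumes such a line, the re.MULTILINE flag, and the helper name drop_lines_without_chunk are assumptions added for the example, not part of the answer.

```python
import re

# ^(?!.*[0-9]{14} ) succeeds only at the start of a line that contains
# no 14-digit run followed by a space. The added ".*\n?" (an assumption
# for this sketch) then consumes that whole line so it can be removed.
NO_CHUNK_LINE = re.compile(r"^(?!.*[0-9]{14} ).*\n?", re.MULTILINE)

def drop_lines_without_chunk(text: str) -> str:
    # Hypothetical helper: deletes every line lacking a 14-digit chunk.
    return NO_CHUNK_LINE.sub("", text)

sample = "foo bar\n12345678901234 keep\nbaz\n"
print(repr(drop_lines_without_chunk(sample)))
# prints '12345678901234 keep\n'
```

This works on whole lines: a line that does contain a chunk is kept intact, including its other text, so it does not by itself strip everything around the 14-digit numbers.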