Regex to replace single backslashes, excluding tho

Posted 2020-03-20 06:55

Question:

I have a regular expression that removes every backslash from a string unless it is followed by one of these characters: \, / or }.

It should turn this string:

foo\bar\\batz\/hi

Into this:

foobar\\batz\/hi

The problem is that the regex handles each backslash on its own as it scans the string. It correctly removes the first backslash and skips the second one, because that one is followed by another backslash. But when it reaches the third one, it removes it, because it isn't followed by another backslash.

My current code looks like this: str.replace(/\\(?!\\|\/|\})/g,"")

But the resulting string looks like this: foobar\batz\/hi
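
A minimal snippet that reproduces this behaviour. It is only a sketch: the variable names are made up for illustration, and String.raw is used so the sample string can be written with exactly the backslashes shown above.

```javascript
// Sample input from the question; String.raw keeps every backslash literal.
const input = String.raw`foo\bar\\batz\/hi`;

// The pattern from the question: a backslash not followed by \, / or }.
const lookaheadOnly = /\\(?!\\|\/|\})/g;

const actual = input.replace(lookaheadOnly, "");

// The second backslash of the escaped pair is followed by "b", so it is removed too.
console.log(actual); // foobar\batz\/hi
console.log(actual === String.raw`foobar\\batz\/hi`); // false
```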

How do I get it to skip the 3rd backslash? Or is it a case of doing some sort of explicit negative search and replace, e.g. replace '\', but don't replace '\\', '\/' or '\}'?

Please help! :)

EDIT

Sorry, I should have explained - I am using javascript, so I don't think I can do negative lookbehinds...

Answer 1:

You need to watch out for an escaped backslash followed by a single backslash. Or, more generally: an odd number of successive backslashes. In that case, you need to keep the even number of backslashes intact and only remove the last one (if it is not followed by a \, / or }).

You can do that with the following regex:

(?<!\\)(?:((\\\\)*)\\)(?![\\/}])

and replace it with:

$1

where the first match group holds the even number (possibly zero) of backslashes that were matched before the final one.

A short explanation:

(?<!\\)          # looking behind, there can't be a '\'
(?:((\\\\)*)\\)  # match an odd number of backslashes and store the even number in group 1
(?![\\/}])       # looking ahead, there can't be a '\', '/' or '}'

In plain English that would read:

match an odd number of backslashes, (?:((\\\\)*)\\), not followed by \ or } or /, (?![\\/}]), and not preceded by a backslash, (?<!\\).

A demo in Java (remember that the backslashes are double escaped!):

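// In a Java string literal, every literal backslash is written as \\.
String s = "baz\\\\\\foo\\bar\\\\batz\\/hi";
System.out.println(s);
// In the pattern each backslash is escaped twice: once for the regex, once for the string literal.
System.out.println(s.replaceAll("(?<!\\\\)(?:((\\\\\\\\)*)\\\\)(?![\\\\/}])", "$1"));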

which will print:

baz\\\foo\bar\\batz\/hi
baz\\foobar\\batz\/hi
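
Since the question is about JavaScript, here is a rough port of the same idea. It is a sketch under two assumptions: the engine supports lookbehind (part of the language since ES2018), and the lookahead excludes }, as in the question's list of characters. The helper name stripLoneBackslashes is hypothetical.

```javascript
// Assumes an engine with lookbehind support (ES2018 or later).
// Group 1 captures the even run of backslashes that must be kept;
// the single backslash after it is the one that gets dropped.
const oddRunLast = /(?<!\\)((?:\\\\)*)\\(?![\\\/}])/g;

// Hypothetical helper name, used here only for illustration.
function stripLoneBackslashes(text) {
  return text.replace(oddRunLast, "$1");
}

console.log(stripLoneBackslashes(String.raw`foo\bar\\batz\/hi`));       // foobar\\batz\/hi
console.log(stripLoneBackslashes(String.raw`baz\\\foo\bar\\batz\/hi`)); // baz\\foobar\\batz\/hi
```

The inner group is non-capturing here, so only the even run needs a group number.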

EDIT

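And a solution that does not need look-behinds would look like:

([^\\])((\\\\)*)\\(?![\\/}])

and is replaced by:

$1$2

where $1 is the non-backslash char at the start, and $2 is the even (or zero) number of backslashes following that non-backslash char. Note that this version requires a character before the run of backslashes, so it does not match a run at the very start of the string; using (^|[^\\]) as the first group covers that case.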
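
In JavaScript, the look-behind-free variant could be sketched as follows. Two adjustments are made for this sketch and are not part of the pattern above: the first group is (^|[^\\]) so that a backslash at the very start of the string is handled too, and the lookahead excludes } as in the question. The helper name stripWithoutLookbehind is hypothetical.

```javascript
// Group 1: start of string, or one non-backslash character.
// Group 2: an even (possibly empty) run of backslashes to keep.
// The single backslash after group 2 is dropped by the replacement.
const noLookbehind = /(^|[^\\])((?:\\\\)*)\\(?![\\\/}])/g;

// Hypothetical helper name, used here only for illustration.
function stripWithoutLookbehind(text) {
  return text.replace(noLookbehind, "$1$2");
}

console.log(stripWithoutLookbehind(String.raw`foo\bar\\batz\/hi`)); // foobar\\batz\/hi
console.log(stripWithoutLookbehind(String.raw`\foo`));              // foo
```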



Answer 2:

The required regex is as simple as \\. (a backslash followed by any single character).

You need to know, however, that the second argument to replace() can be a function, like so:

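// Example input from the question; String.raw keeps the backslashes literal.
var string = String.raw`foo\bar\\batz\/hi`;
var result = string.replace(/\\./g, function (ab) { // ab is the matched portion of the input string
    var b = ab.charAt(1);
    switch (b) { // if char after backslash
    case '\\': case '}': case '/': // ..is one of these
        return ab; // keep original string
    default: // else
        return b; // replace by second char
    }
});
console.log(result); // foobar\\batz\/hi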


Answer 3:

You need a lookahead, like you have, and also a lookbehind, to ensure that you don't delete the second backslash (which clearly doesn't have a special character after it). Try this:

(?<![\\])[\\](?![\\\/\}]) as your regex
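
A quick check of this pattern in JavaScript, assuming an engine with lookbehind support (ES2018 or later); the variable names are made up for illustration. It yields the requested output for the question's sample, but it leaves a run of three backslashes untouched, which is the odd-run case the pattern in answer 1 is built to handle.

```javascript
// Assumes lookbehind support (ES2018 or later).
const simpleLookaround = /(?<![\\])[\\](?![\\\/\}])/g;

const sample = String.raw`foo\bar\\batz\/hi`;
const oddRun = String.raw`baz\\\foo`;

console.log(sample.replace(simpleLookaround, "")); // foobar\\batz\/hi
// Every backslash in the run of three either follows or precedes another backslash,
// so none of them matches and the string comes back unchanged.
console.log(oddRun.replace(simpleLookaround, "")); // baz\\\foo
```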