regex to find variables surrounded by % in a string

Posted 2019-09-21 13:31

Question:

I need to find "variables" within a string. What denotes a variable is %[\w]+%; the catch is that there can be more than one variable within the string:

%ABC%
%ABC%-%RED%
Lorem ipsum %GeT% sit amet, %% consectetur %QW23% elit. 

In the third example the %% should NOT be found; it will later be replaced with a single %. Something like #[\w+-]+# does not work because, in the second line, it cannot tell that the variables are %ABC% and %RED% and instead picks up %-%. I am under the impression that both groups and back references need to be used, but I cannot find a good example that explains how to do this in Java.


Folks are asking for some answers to questions, so here you go:

What exactly am I expecting as the final output? Well, as the subject suggests the %ABC% is a 'variable' that is defined somewhere else, the final goal is to "find the variable and replace it with the correct value". The goal with the regular expression is to find all the 'variables' within a string.

So, there is a map somewhere in memory where:

ABC = "mike"
RED = "Red Storm"
GeT = "hometown"
QW23 = "Quick and easy"

(side note: if the keys need to have % around the name, that is OK, too)

The goal of the regex is to 'find' the variable, so in the first string it will find ABC (or %ABC%) so that the code can look up ABC and determine that the correct value is mike, and so on. Here is the desired output for the strings given:

mike
mike-Red Storm
Lorem ipsum hometown sit amet, % consectetur Quick and easy elit. 

I am not expecting the regular expression to do the full replacement, only to find the pieces so that other code can do the replacement. I am also not expecting it to convert %% to %; it should leave %% alone so that a simple search for %% afterwards can convert it to %.

Answer 1:

I believe you are looking for a regex pattern

(?<!%%)(?<=%)\w+(?=%)(?!%%)

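That would find variable names that have a single % character on each side. Note that the % signs are only asserted by the lookarounds, not consumed, so a word sitting between two variables (the x in %ABC%x%RED%) is matched as well.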


Java code:

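import java.util.regex.Matcher;
import java.util.regex.Pattern;

// input is the string to scan, e.g. "%ABC%-%RED%"
final Pattern pattern = Pattern.compile("(?<!%%)(?<=%)\\w+(?=%)(?!%%)");
final Matcher matcher = pattern.matcher(input);

while (matcher.find()) {
    // group(0) is the variable name without the surrounding % signs
    System.out.println(matcher.group(0));
}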


UPDATE:

If you want to catch groups as requested in your comment below, use the following pattern:

(?<!%)(%)(\w+)(%)(?!%)


...and Java code:

final Pattern pattern = Pattern.compile("(?<!%)(%)(\\w+)(%)(?!%)");
final Matcher matcher = pattern.matcher(input);

while (matcher.find()) {
    // group(1) and group(3) are the % delimiters; group(2) is the variable name
    System.out.println(matcher.group(1) + " | " +
                       matcher.group(2) + " | " +
                       matcher.group(3));
}
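
For the lookup-and-replace step described in the question, here is a minimal sketch built on the grouped pattern above (with only the name captured). The class name VariableExpander, the expand and sampleValues helpers, and the choice to leave unknown names untouched are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VariableExpander {
    // %NAME% that is neither preceded nor followed by another %.
    private static final Pattern VARIABLE = Pattern.compile("(?<!%)%(\\w+)%(?!%)");

    // Hypothetical helper: the map from the question.
    static Map<String, String> sampleValues() {
        Map<String, String> values = new HashMap<>();
        values.put("ABC", "mike");
        values.put("RED", "Red Storm");
        values.put("GeT", "hometown");
        values.put("QW23", "Quick and easy");
        return values;
    }

    static String expand(String input, Map<String, String> values) {
        Matcher matcher = VARIABLE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            // Assumption: a name missing from the map is left as %NAME%.
            String replacement = values.getOrDefault(name, matcher.group());
            // quoteReplacement keeps '$' and '\' in values from being
            // interpreted as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        // Second step from the question: collapse the escaped %% to %.
        return out.toString().replace("%%", "%");
    }

    public static void main(String[] args) {
        Map<String, String> values = sampleValues();
        System.out.println(expand("%ABC%", values));
        System.out.println(expand("%ABC%-%RED%", values));
        System.out.println(expand(
            "Lorem ipsum %GeT% sit amet, %% consectetur %QW23% elit.", values));
    }
}
```

One caveat of this two-step approach: the final replace("%%", "%") also runs over the substituted values, so a value that itself contains %% would be altered.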



Answer 2:

If you want the variables in 3 capturing groups without treating %% as a variable, you could use an alternation |: the first alternative matches %% and captures nothing, and the second captures a variable in 3 groups:

%%|(%)(\w+)(%)
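
A rough sketch of how this pattern might be used in Java follows. When the %% alternative matches, none of the groups take part in the match, so matcher.group(2) returns null; that check is what separates an escaped percent sign from a variable. The class name SinglePassExpander, the expand and sampleValues helpers, and the choice to leave unknown names untouched are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SinglePassExpander {
    // First alternative: escaped %%. Second alternative: %NAME% in 3 groups.
    private static final Pattern TOKEN = Pattern.compile("%%|(%)(\\w+)(%)");

    // Hypothetical helper: the map from the question.
    static Map<String, String> sampleValues() {
        Map<String, String> values = new HashMap<>();
        values.put("ABC", "mike");
        values.put("RED", "Red Storm");
        values.put("GeT", "hometown");
        values.put("QW23", "Quick and easy");
        return values;
    }

    static String expand(String input, Map<String, String> values) {
        Matcher matcher = TOKEN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(2);
            String replacement;
            if (name == null) {
                // The %% alternative matched: emit a single literal %.
                replacement = "%";
            } else {
                // Assumption: a name missing from the map is left as %NAME%.
                replacement = values.getOrDefault(name, matcher.group());
            }
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = sampleValues();
        System.out.println(expand("%ABC%-%RED%", values));
        System.out.println(expand(
            "Lorem ipsum %GeT% sit amet, %% consectetur %QW23% elit.", values));
    }
}
```

Because the matcher consumes the input from left to right, the %% handling and the variable lookup happen in one pass, and substituted values are never rescanned. A side effect worth knowing: directly adjacent variables such as %ABC%%RED% are read as two variables here, since %ABC% is consumed before the following %% could be seen as an escape.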
