Assume a one-line string with multiple consecutive key-value pairs, separated by a space, where spaces are also allowed within values (but not in keys), e.g.
key1=one two three key2=four key3=five six key4=seven eight nine ten
Correctly extracting the key-value pairs from the above would produce the following mappings:
"key1", "one two three"
"key2", "four"
"key3", "five six"
"key4", "seven eight nine ten"
where "keyX" can be any sequence of characters, excluding space.
Trying something simple, like
([^=]+=[^=]+)+
or similar variations is not adequate.
Is there a regex to fully handle such extraction, without any further string processing?
Rather than a regular expression, I suggest you parse it using indexOf.
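A minimal sketch of the indexOf approach follows. It is not necessarily the code this answer originally showed. The class name IndexOfKeyValueParser and the method parse are hypothetical, and the sketch assumes that keys contain neither spaces nor '=' and that values contain no '='. The idea: each '=' ends a key, and the value runs up to the last space before the next '=' (or to the end of the line for the final pair).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative only.
final class IndexOfKeyValueParser {

    // Assumes keys contain neither spaces nor '=', and values contain no '='.
    static Map<String, String> parse(String line) {
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps input order
        int keyStart = 0;
        int eq = line.indexOf('=');
        while (eq >= 0) {
            String key = line.substring(keyStart, eq).trim();
            int nextEq = line.indexOf('=', eq + 1);
            int valueEnd;
            if (nextEq < 0) {
                // Last pair: the value runs to the end of the line.
                valueEnd = line.length();
            } else {
                // The next key starts right after the last space before the next '='.
                valueEnd = line.lastIndexOf(' ', nextEq);
                if (valueEnd < eq) {
                    throw new IllegalArgumentException(
                            "Missing space before the key that ends at index " + nextEq);
                }
            }
            pairs.put(key, line.substring(eq + 1, valueEnd).trim());
            keyStart = valueEnd;
            eq = nextEq;
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "key1=one two three key2=four key3=five six key4=seven eight nine ten";
        parse(line).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

A LinkedHashMap is used only so the pairs come back in the order they appear in the input.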
Something like this is also possible if whitespace is not duplicated:
Otherwise, you can always write this:
These patterns are a good solution if key names are not too long, since they rely on backtracking.
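To illustrate the kind of backtracking-based pattern described above, here is a hedged sketch in Java. These are not necessarily the patterns this answer originally contained. The names BacktrackingKeyValueRegex, SINGLE_SPACE, ANY_WHITESPACE and extract are hypothetical, and both patterns assume that values contain no '='. In each one the value first greedily swallows everything up to the next '=', including the next key name, and the engine then gives characters back until a separator followed by "key=" is found. That is why long key names cost more.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class BacktrackingKeyValueRegex {

    // Variant for input where pairs are separated by exactly one space.
    static final Pattern SINGLE_SPACE =
            Pattern.compile("([^\\s=]+)=([^=]+)(?: (?=[^\\s=]+=)|$)");

    // Variant that tolerates runs of whitespace between pairs and at the end:
    // the value is forced to end on a non-whitespace character.
    static final Pattern ANY_WHITESPACE =
            Pattern.compile("([^\\s=]+)=([^=]*[^\\s=])(?:\\s+(?=[^\\s=]+=)|\\s*$)");

    // Group 1 is the key, group 2 is the value.
    static Map<String, String> extract(Pattern pattern, String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "key1=one two three key2=four key3=five six key4=seven eight nine ten";
        System.out.println(extract(SINGLE_SPACE, line));
        System.out.println(extract(ANY_WHITESPACE, "a=b  c=d e   f=g "));
    }
}
```

With duplicated spaces, the single-space variant would leave trailing whitespace in the captured value, which is the reason for the second variant.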
You can also write this, which needs at most one step of backtracking:
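One way to keep backtracking small, sketched here under assumptions, is to build the value word by word and check with a negative lookahead that the next word is not a key before consuming it. This is not necessarily the pattern the answer originally showed, and I make no claim about its exact number of backtracking steps. The names LookaheadPerWordRegex, PAIR and extract are hypothetical, and the sketch assumes non-empty values whose words contain no '='.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class LookaheadPerWordRegex {

    // key = first word of the value, then zero or more further words,
    // each accepted only if it is NOT immediately followed by '='.
    static final Pattern PAIR = Pattern.compile(
            "([^\\s=]+)=([^\\s=]+(?:\\s+(?![^\\s=]+=)[^\\s=]+)*)");

    // Group 1 is the key, group 2 is the value.
    static Map<String, String> extract(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "key1=one two three key2=four key3=five six key4=seven eight nine ten";
        extract(line).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Because the value never swallows the next key name, the engine does not have to give those characters back afterwards.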
Try with a lookahead:
As a Java String:
Test at regex101.com; Test at regexplanet (click on "Java")
\1 contains the key and \2 the value.
Escape \ with \\ in Java.
Demo: https://regex101.com/r/dO8kM2/1
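As a hedged illustration of the lookahead idea in Java: the pattern below is my own sketch and may differ from the one behind the demo link. The names LazyLookaheadRegex, PAIR and extract are hypothetical. The value is matched lazily and stops as soon as the lookahead sees whitespace followed by another "key=", or the end of the line. Group 1 is the key and group 2 the value, and every backslash of the regex is doubled inside the Java string literal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class LazyLookaheadRegex {

    // Regex:       ([^\s=]+)=(.*?)(?=\s+[^\s=]+=|\s*$)
    // Java string: each backslash is written twice.
    static final Pattern PAIR =
            Pattern.compile("([^\\s=]+)=(.*?)(?=\\s+[^\\s=]+=|\\s*$)");

    // Group 1 is the key, group 2 is the value.
    static Map<String, String> extract(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "key1=one two three key2=four key3=five six key4=seven eight nine ten";
        extract(line).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

The lazy quantifier re-tests the lookahead after every character of the value, so this trades some speed for a short pattern.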