java - Regex to split a string using spaces but no

Posted 2019-04-14 14:45

Question:

I want to split a string on spaces, but without splitting on spaces that are inside double quotes or single quotes.

I tried using "Regex for splitting a string using space when not surrounded by single or double quotes", but it failed in some cases.

Input: It is a "beautiful day"'but i' cannot "see it"

The output should be:

It
is
a
"beautiful day"'but i'
cannot
"see it"

The regex in the above link resulted in:

It
is
a
"beautiful day"
'but i'
cannot
"see it"

I want "beautiful day"'but i' on one line.

Can somebody help me write the correct regex?

Answer 1:

This regex passes your test:

" (?=(([^'\"]*['\"]){2})*[^'\"]*$)"

It splits on a space, but only when the space is not inside quotes. It tests this with a lookahead that asserts that an even number of quote characters (single or double) follows the space.
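To make the lookahead's condition concrete, here is a minimal sketch that applies the same rule by hand: for each space, count the quote characters after it and split only when that count is even. The class and method names (QuoteParitySplit, splitByParity, quotesAfter) are made up for illustration and are not part of the answer. Unlike String.split, this version keeps trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteParitySplit {
    // Hypothetical helper: splits on a space only when an even number of
    // quote characters (' or ") follows it, mirroring the lookahead's test.
    static List<String> splitByParity(String s) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ' ' && quotesAfter(s, i) % 2 == 0) {
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(s.substring(start));
        return parts;
    }

    // Counts single and double quote characters strictly after index pos.
    static int quotesAfter(String s, int pos) {
        int count = 0;
        for (int i = pos + 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\'') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "It is a \"beautiful day\"'but i' cannot \"see it\"";
        for (String part : splitByParity(s)) {
            System.out.println(part);
        }
    }
}
```

A space inside "beautiful day" has five quote characters after it, so it is kept; the space after 'but i' has two, so it is a split point.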

There are some edge cases this won't handle, but if your input is "well formed" (i.e. the quotes are balanced), it will work for you. If the quotes are not balanced, it is still doable: you would need two lookaheads, one for each quote type.
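One edge case worth seeing is an apostrophe inside a double-quoted phrase: it makes the total quote count odd, so spaces before it are no longer treated as split points. The sketch below shows that, and then a different technique that is not the one in this answer: matching tokens with Pattern and Matcher instead of splitting. The class name QuoteAwareTokenizer, the method tokenize, and the token pattern are assumptions made for illustration. Note that the pattern treats any whitespace as a separator, not only a space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareTokenizer {
    // The split regex from the answer, kept here for comparison.
    static final String SPLIT_REGEX = " (?=(([^'\"]*['\"]){2})*[^'\"]*$)";

    // Illustrative pattern: a token is one or more of a run of non-space,
    // non-quote characters, a double-quoted section, or a single-quoted section.
    private static final Pattern TOKEN =
            Pattern.compile("(?:[^\\s\"']+|\"[^\"]*\"|'[^']*')+");

    // Hypothetical helper: collects every token match in order.
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String edgeCase = "say \"it's fine\" now";

        // Three quote characters in total, so only the last space is followed
        // by an even count (zero) and the first space is not split on.
        System.out.println(Arrays.asList(edgeCase.split(SPLIT_REGEX)));

        // Each quote type is matched on its own, so the apostrophe is
        // just ordinary content inside the double-quoted section.
        System.out.println(tokenize(edgeCase));
    }
}
```

With this pattern a quote character that has no closing partner is silently skipped, so treat it as a starting point rather than a complete parser.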


Here's some test code:

// Split on each space that is followed by an even number of quote characters.
String s = "It is a \"beautiful day\"'but i' cannot \"see it\"";
String[] parts = s.split(" (?=(([^'\"]*['\"]){2})*[^'\"]*$)");
for (String part : parts) {
    System.out.println(part);
}

Output:

It
is
a
"beautiful day"'but i'
cannot
"see it"