php -> preg_replace -> remove space ONLY between q

Posted 2019-02-26 07:15

Question:

I'm trying to remove spaces ONLY between quotes, as in:

$text = 'good with spaces "here all spaces should be removed" and here also good';

Can someone help with a working piece of code? I already tried:

$regex = '/(\".+?\")|\s/';

or

$regex = '/"(?!.?\s+.?)/';

without success. I also found a sample that works in the opposite direction :-( ("Removing whitespace-characters, except inside quotation marks in PHP?"), but I can't adapt it.

thx Newi

Answer 1:

This kind of problem is easily solved with preg_replace_callback. The idea is to extract each substring between quotes and then edit it in the callback function:

$text = preg_replace_callback('~"[^"]*"~', function ($m) {
    return preg_replace('~\s~', '#', $m[0]);
}, $text);

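It's the simplest way. (The '#' replacement just makes the replaced positions visible; use '' to actually remove the whitespace.)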
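
As a minimal sketch, here is the same callback approach applied to the string from the question, with an empty replacement so the whitespace is removed rather than marked. The variable name $result is only for illustration:

```php
<?php
$text = 'good with spaces "here all spaces should be removed" and here also good';

// Match each "..." segment, then strip whitespace only inside that segment.
$result = preg_replace_callback('~"[^"]*"~', function ($m) {
    // $m[0] is the whole quoted segment, quotes included.
    return preg_replace('~\s+~', '', $m[0]);
}, $text);

echo $result, "\n";
// good with spaces "hereallspacesshouldberemoved" and here also good
```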


Doing it with a single preg_replace pattern is more complicated, but it's possible:

$text = preg_replace('~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~', '#', $text);

Pattern details:

(?:
    \G (?!\A)  # the position where the previous match ended (not the string start)
  |
    "          # or the opening double quote
)
[^"\s]*        # characters that are neither double quotes nor whitespace
\K             # discard all characters matched before from the match result
(?:
    \s         # a whitespace
  |
    "           # or the closing quote
    (*SKIP)(*F) # force the pattern to fail and to skip the quote position
                # (this way, the closing quote isn't seen as an opening quote
                # in the second branch.)
)

This approach uses the \G anchor to ensure that every matched whitespace character lies between quotes.
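
A small sketch of how the single-pattern version might be wrapped and tried out. The helper name removeSpacesInsideQuotes and the second input string are made up for illustration, and the replacement is '' instead of '#':

```php
<?php
// Hypothetical helper around the single-pattern approach described above.
function removeSpacesInsideQuotes(string $text): string
{
    $pattern = '~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~';
    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace($pattern, '', $text) ?? $text;
}

echo removeSpacesInsideQuotes('good with spaces "here all spaces should be removed" and here also good'), "\n";
// good with spaces "hereallspacesshouldberemoved" and here also good

echo removeSpacesInsideQuotes('a b "c d" e f "g h" i'), "\n";
// a b "cd" e f "gh" i
```

The second call shows that a closing quote is not mistaken for a new opening quote: the text between the two quoted segments keeps its spaces.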

Edge cases:

  • there's an orphan opening quote: in this case, all whitespace from the last quote to the end of the string is replaced. If you want, you can change this behavior by adding a lookahead that checks whether the closing quote exists:

    ~(?:\G(?!\A)|"(?=[^"]*"))[^"\s]*\K(?:\s|"(*SKIP)(*F))~

  • double quotes can contain escaped double quotes that have to be ignored: in that case you have to describe escaped characters like this:

    ~(?:\G(?!\A)|")[^"\s\\\\]*+(?:\\\\\S[^"\s\\\\]*)*+(?:\\\\?\K\s|"(*SKIP)(*F))~
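
To make the two edge cases concrete, here is a rough sketch that runs the base pattern and both variants on sample inputs. The input strings and variable names are invented for illustration and do not come from the original post:

```php
<?php
// Base pattern and the two variants quoted in the list above.
$base     = '~(?:\G(?!\A)|")[^"\s]*\K(?:\s|"(*SKIP)(*F))~';
$balanced = '~(?:\G(?!\A)|"(?=[^"]*"))[^"\s]*\K(?:\s|"(*SKIP)(*F))~';
$escaped  = '~(?:\G(?!\A)|")[^"\s\\\\]*+(?:\\\\\S[^"\s\\\\]*)*+(?:\\\\?\K\s|"(*SKIP)(*F))~';

// Illustrative inputs.
$orphan = 'a b "c d" e "f g';                          // the last quote never closes
$nested = 'say "he said \"hi there\" to me" ok now';  // contains escaped quotes

$r1 = preg_replace($base, '', $orphan);      // a b "cd" e "fg
$r2 = preg_replace($balanced, '', $orphan);  // a b "cd" e "f g
$r3 = preg_replace($escaped, '', $nested);   // say "hesaid\"hithere\"tome" ok now

echo $r1, "\n", $r2, "\n", $r3, "\n";
```

Note that in a single-quoted PHP string `\\\\` reaches the regex engine as `\\`, which matches one literal backslash.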


Another strategy, suggested by @revo: use a lookahead to check whether the number of remaining quotes at a position is odd or even:

\s(?=[^"]*+(?:"[^"]*"[^"]*)*+")

It is a short pattern, but it can be slow on long strings, since for each whitespace position the lookahead has to scan the string up to the last quote.
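
A quick sketch of the lookahead strategy on the string from the question. It assumes the quotes are balanced and unescaped, since the odd/even count is only meaningful in that case:

```php
<?php
$text = 'good with spaces "here all spaces should be removed" and here also good';

// A whitespace character is inside quotes when an odd number of quotes follows it.
// The possessive quantifiers (*+) stop the lookahead from backtracking into an even count.
$result = preg_replace('~\s(?=[^"]*+(?:"[^"]*"[^"]*)*+")~', '', $text);

echo $result, "\n";
// good with spaces "hereallspacesshouldberemoved" and here also good
```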



Answer 2:

See the following code snippet:

<?php
$text = 'good with spaces "here all spaces should be removed" and here also good';
echo "$text \n";
$regex = '/(\".+?\")|\s/';
$regex = '/"(?!.?\s+.?)/';
$text = preg_replace($regex,'', $text);
echo "$text \n";
?>

I found a sample that works in the wrong direction :-(


@Graham: correct

$text = 'good with spaces "here all spaces should be removed" and here also good';
should become
$text = 'good with spaces "hereallspacesshouldberemoved" and here also good';