Sanitize sentence in PHP

Posted 2019-03-13 08:14

Question:

The title may sound odd, but I'm trying to set up a preg_replace that cleans up after messy writers in a textarea. It has to follow these rules:

  1. If there is an exclamation mark, there should not be another one right after it.
  2. If a period is directly followed by a comma (.,), the comma wins and the result is just ,
  3. When there are one or more spaces before a comma, they should be removed.
  4. The sentence cannot start or end with a comma.
  5. There should never be more than 2 of the same letter in a row.
  6. A space must always be present after a comma.

E.g.:

  • ,My house, which is green., is nice!
  • My house..., which is green, is nice!!!
  • My house ,which is green,,, is nice!!

The end result should always be:

My house, which is green, is nice!

Is there an already built regex that takes care of this?

Solution: check out FakeRainBrigand's solution below!

Answer 1:

I might have to use this for my own sites... nice idea!

<?php

$text = 'My hooouse..., which is greeeeeen , is nice!!!  ,And pretty too...';

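# Every match is replaced by its first capture group ($1).
$pats = array(
'/([.!?]\s{2}),/',   # Abc.  ,Def  -> drop the comma that opens a new sentence
'/\.+(,)/',          # ......,     -> ,
'/(!)!+/',           # abc!!!!!!!! -> abc!
'/\s+(,)/',          # abc   , def -> abc, def
'/([a-zA-Z])\1\1/',  # greeeeeeen  -> three identical letters become one
'/,(?!\s)/');        # abc,def     -> no capture group, so '$1' is empty and the comma is deleted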

$fixed = preg_replace($pats, '$1', $text);

echo $fixed;
echo "\n\n";

?>

And the 'modified' version of $text: "My house, which is green, is nice!  And pretty too..." (none of the patterns touch the two spaces after "!" or the trailing "...").
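When preg_replace is given an array of patterns, it applies them in order, each one to the output of the previous one. To see what each pattern contributes, a small loop like the following can help. This is only a sketch: trace_patterns is a hypothetical helper name and the sample input is made up for illustration.

```php
<?php
// Hypothetical helper (not part of the answer above): applies each pattern
// in turn and records the string as it looks after that pattern has run.
function trace_patterns(array $patterns, string $replacement, string $text): array
{
    $steps = array();
    foreach ($patterns as $pattern) {
        $text = preg_replace($pattern, $replacement, $text);
        $steps[$pattern] = $text;
    }
    return $steps;
}

$pats = array(
    '/([.!?]\s{2}),/',
    '/\.+(,)/',
    '/(!)!+/',
    '/\s+(,)/',
    '/([a-zA-Z])\1\1/',
);

// Made-up input: the first pattern drops the comma, the third collapses "!!!".
foreach (trace_patterns($pats, '$1', 'Hi!!!  ,there') as $pattern => $result) {
    printf("%-20s => %s\n", $pattern, $result);
}
```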

UPDATE: Here's the version that handles "abc,def" -> "abc, def".

<?php

$text = 'My hooouse..., which is greeeeeen ,is nice!!!  ,And pretty too...';

$pats = array(
'/([.!?]\s{2}),/', # Abc.  ,Def
'/\.+(,)/',        # ......,
'/(!)!+/',         # abc!!!!!!!!
'/\s+(,)/',        # abc   , def
'/([a-zA-Z])\1\1/');      # greeeeeeen

$fixed = preg_replace($pats, '$1', $text);
# Separate pass: this pattern needs ', ' as its replacement, not '$1'.
$really_fixed = preg_replace('/,(?!\s)/', ', ', $fixed);

echo $really_fixed;
echo "\n\n";
?>

I would think this is a bit slower since it's an additional function call.
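If the extra call is a concern, preg_replace also accepts an array of replacements that is paired with the array of patterns, so the comma-spacing step can go into the same call. A minimal sketch, assuming PHP 7 or later; the function name tidy_sentence and the sample input are illustrative only.

```php
<?php
// Sketch: same patterns as above, but with one replacement per pattern,
// so "add a space after a comma" no longer needs its own call.
function tidy_sentence(string $text): string
{
    $patterns = array(
        '/([.!?]\s{2}),/',   // Abc.  ,Def -> drop the stray comma
        '/\.+(,)/',          // ......,    -> ,
        '/(!)!+/',           // abc!!!!    -> abc!
        '/\s+(,)/',          // abc   ,def -> abc,def
        '/([a-zA-Z])\1\1/',  // hooouse    -> house (a triple becomes one letter)
        '/,(?!\s)/',         // abc,def    -> abc, def
    );
    $replacements = array('$1', '$1', '$1', '$1', '$1', ', ');

    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace($patterns, $replacements, $text) ?? $text;
}

echo tidy_sentence('My hooouse..., which is green ,is nice!!!'), "\n";
// prints: My house, which is green, is nice!
```

The patterns still run in array order, so the comma-spacing pattern has to stay last: it relies on the earlier patterns having already removed stray dots and spaces around commas.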



Answer 2:

 - $result = preg_replace('/!+/', '!', $subject);
 - $result = preg_replace('/\.*,/', ',', $result);
 - $result = preg_replace('/\s+(?=,)/', '', $result);
 - $result = preg_replace('/^,*|,*$/', '', $result);
 - $result = preg_replace('/([a-z])\1+/i', '$1$1', $result);
 - $result = preg_replace('/,(?!\s)/', ', ', $result);

Each line corresponds to one of your rules, in order :)
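A sketch that wraps the six replacements above in one function, so each rule works on the output of the previous one. The function name sanitize_sentence is hypothetical. The rule marked "extra" is an assumption that is not part of the answer: none of the six rules collapse a run like ",,," from the third example, so it is added here to make that example come out as requested.

```php
<?php
// Hypothetical wrapper around the six rules, applied in order.
function sanitize_sentence(string $subject): string
{
    $rules = array(
        '/!+/'          => '!',     // 1. collapse repeated exclamation marks
        '/\.*,/'        => ',',     // 2. dots directly before a comma lose to the comma
        '/,{2,}/'       => ',',     // extra (assumption): collapse repeated commas
        '/\s+(?=,)/'    => '',      // 3. no whitespace before a comma
        '/^,*|,*$/'     => '',      // 4. no leading or trailing commas
        '/([a-z])\1+/i' => '$1$1',  // 5. at most two identical letters in a row
        '/,(?!\s)/'     => ', ',    // 6. a space after every comma
    );
    foreach ($rules as $pattern => $replacement) {
        $subject = preg_replace($pattern, $replacement, $subject);
    }
    return $subject;
}

$examples = array(
    ',My house, which is green., is nice!',
    'My house..., which is green, is nice!!!',
    'My house ,which is green,,, is nice!!',
);
foreach ($examples as $example) {
    echo sanitize_sentence($example), "\n";
}
// Each line prints: My house, which is green, is nice!
```

Rule 5 caps a run at two letters, as the question asks, so "hooouse" becomes "hoouse" rather than "house". A regex alone cannot tell which doubled letters are legitimate.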