PHP preg_replace() - Memory Issues. Alternative?

Posted 2019-08-09 12:53

Question:

On a site I'm working on, I'm converting strings to slugs using the answer in this question. It works, but I'm seeing huge memory leaks. I've done some research, and this appears to be a current bug in PHP.

Are there any alternatives for converting strings to slugs?

EDIT:

There's another interesting angle to this problem. I'm redeveloping a scraper that was built with regex (ugh, I know), so I decided to use DOMDocument / XPath instead.

The interesting thing is that the original regex scraper also uses the slugify() function above, and it has no memory issues. However, once I set up the DOMDocument scraper, the scrape crashes halfway through, and the error is always on the preg_replace() line in slugify().

So although both scrapers use the exact same slugify() function, only the DOMDocument version crashes on the preg_replace() line.

Answer 1:

preg_replace() is pretty good for this, but an alternative is to strip the unwanted characters out with str_replace(): http://php.net/manual/en/function.str-replace.php
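
As a rough illustration of the str_replace() route, here is a minimal sketch. The function name slugify_no_regex() and the separator list are assumptions made up for this example, not part of the original slugify(); it handles only ASCII input and does no transliteration.

```php
<?php
// Hypothetical regex-free slugify; ASCII-only sketch.
function slugify_no_regex(string $text): string
{
    $text = strtolower($text);

    // Assumed separator list: extend it to match your own input.
    $separators = [' ', "\t", "\n", '_', '.', ',', ';', ':', '!', '?',
                   '/', '\\', '(', ')', '[', ']', "'", '"', '&', '+'];
    $text = str_replace($separators, '-', $text);

    // Collapse runs of hyphens; one pass of str_replace() is not
    // enough for three or more in a row, hence the loop.
    while (strpos($text, '--') !== false) {
        $text = str_replace('--', '-', $text);
    }

    $text = trim($text, '-');

    return $text === '' ? 'n-a' : $text;
}

echo slugify_no_regex('Hello, World!'), "\n";
```

Characters that are not in the separator list pass through unchanged, so this is only a drop-in replacement if the input is known to be reasonably clean.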



Answer 2:

Unsetting the intermediate variables should free up some memory. Yes, it's dirty, but it might work:

public static function slugify($text) {
  // replace runs of non-letter, non-digit characters with -
  $text2 = preg_replace('~[^\\pL\d]+~u', '-', $text);

  // unset $text to free up space
  unset($text);

  // trim leading and trailing hyphens
  $text2 = trim($text2, '-');

  // transliterate to ASCII
  $text2 = iconv('utf-8', 'us-ascii//TRANSLIT', $text2);

  // lowercase
  $text2 = strtolower($text2);

  // remove any remaining unwanted characters
  $text = preg_replace('~[^-\w]+~', '', $text2);

  // unset $text2 to free up space
  unset($text2);

  if (empty($text)) {
    return 'n-a';
  }
  return $text;
}

https://bugs.php.net/bug.php?id=35258&edit=1

http://www.php.net/manual/en/function.preg-replace.php#84285

Hopefully you find a cleaner solution.
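
To check whether a change like this actually helps, one option is to measure memory before and after a batch of calls. The sketch below is only an illustration: memory_delta() and the $slugify closure are hypothetical names, and the closure is a simplified stand-in for the slugify() method above, not a copy of it.

```php
<?php
// Hypothetical helper: calls $fn repeatedly and returns the change
// in memory_get_usage() (in bytes) across the whole batch.
function memory_delta(callable $fn, string $input, int $iterations): int
{
    gc_collect_cycles();
    $before = memory_get_usage();

    for ($i = 0; $i < $iterations; $i++) {
        // Vary the input slightly so each call does real work.
        $fn($input . $i);
    }

    gc_collect_cycles();

    return memory_get_usage() - $before;
}

// Simplified stand-in for slugify(); swap in the real method to test it.
$slugify = function (string $text): string {
    $slug = preg_replace('~[^\pL\d]+~u', '-', $text);
    return strtolower(trim((string) $slug, '-'));
};

printf("delta: %d bytes\n", memory_delta($slugify, 'Some Title To Slugify', 10000));
```

If the delta stays roughly flat as the iteration count grows, the function itself is probably not where the memory is going.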



Answer 3:

I found this bug, https://bugs.php.net/bug.php?id=38728, which says to use the mb_eregi_replace() function instead.

It worked for me.
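
For reference, a minimal sketch of what that swap might look like. The function name slugify_mb() is made up for this example, it assumes the mbstring extension is available with regex support, and it uses an ASCII-only character class rather than the \pL pattern from the original, so non-ASCII letters are replaced rather than transliterated.

```php
<?php
// Hypothetical slugify built on mb_eregi_replace() instead of preg_replace().
function slugify_mb(string $text): string
{
    mb_regex_encoding('UTF-8');

    // Note: mb_ereg* patterns take no delimiters, unlike preg_* patterns.
    $slug = mb_eregi_replace('[^a-zA-Z0-9]+', '-', $text);

    // mb_eregi_replace() does not return a string on failure.
    if (!is_string($slug)) {
        return 'n-a';
    }

    $slug = strtolower(trim($slug, '-'));

    return $slug === '' ? 'n-a' : $slug;
}

echo slugify_mb('Hello, World!'), "\n";
```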