What is the best way to clean a string for placeme

Posted 2019-02-07 05:35

Question:

I'm looking to create a URL string like the one SO uses for the links to the questions. I am not looking at rewriting the url (mod_rewrite). I am looking at generating the link on the page.

Example: The question name is:

Is it better to use ob_get_contents() or $text .= ‘test’;

The URL ends up being:

http://stackoverflow.com/questions/292068/is-it-better-to-use-obgetcontents-or-text-test

The part I'm interested in is:

is-it-better-to-use-obgetcontents-or-text-test

So basically I'm looking to clean out anything that is not alphanumeric while still keeping the URL readable. I have the following created, but I'm not sure if it's the best way or if it covers all the possibilities:

$str = urlencode(
    strtolower(
        str_replace('--', '-',
            preg_replace(array('/[^a-z0-9 ]/i', '/[^a-z0-9]/i'), array('', '-'),
                trim($urlPart)))));

So basically:

  1. trim
  2. remove any character that is neither alphanumeric nor a space
  3. then replace every remaining non-alphanumeric character (that is, each space) with a dash
  4. replace -- with -.
  5. strtolower()
  6. urlencode() -- probably not needed, but just for good measure.

Answer 1:

As you already pointed out, urlencode() is not needed in this case, and neither is trim(). If I understand correctly, step 4 is meant to avoid multiple dashes in a row, but a single str_replace() pass only collapses pairs, so a run of three or more dashes still leaves at least two behind. Also, dashes that connect two words (as in "large-scale") are removed by your solution, whereas SO seems to preserve them.
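To make those two observations concrete, here is a small sketch that wraps the expression from the question in a function and runs it on two inputs. The function name questionSlug() is hypothetical and only there so the expression can be called; the body is unchanged from the question.

```php
<?php
// Hypothetical wrapper around the expression from the question (body unchanged).
function questionSlug(string $urlPart): string
{
    return urlencode(
        strtolower(
            str_replace('--', '-',
                preg_replace(
                    array('/[^a-z0-9 ]/i', '/[^a-z0-9]/i'),
                    array('', '-'),
                    trim($urlPart)
                )
            )
        )
    );
}

// Three spaces become three dashes; one str_replace() pass leaves two of them.
echo questionSlug('a   b'), "\n";        // a--b

// The dash is stripped by the first pattern, so the words are glued together.
echo questionSlug('large-scale'), "\n";  // largescale
```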

I'm not sure that this is really the best way to do it, but here's my suggestion:

$str = strtolower(
    preg_replace(array('/[^a-z0-9\- ]/i', '/[ \-]+/'), array('', '-'),
        $urlPart));

So:

  1. remove any character that is neither space, dash, nor alphanumeric
  2. replace any consecutive number of spaces or dashes with a single dash
  3. strtolower()
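As a rough sketch, the three steps above could be wrapped in a reusable function and tried on the title from the question. The name slugify() is hypothetical. The final trim($slug, '-') is an assumption that goes beyond the suggestion above: without it, leading or trailing spaces in the input would turn into a leading or trailing dash.

```php
<?php
// Hypothetical helper name; steps 1 to 3 are the expression suggested above.
function slugify(string $urlPart): string
{
    $slug = strtolower(
        preg_replace(
            array(
                '/[^a-z0-9\- ]/i', // 1. drop everything except letters, digits, dashes, spaces
                '/[ \-]+/',        // 2. collapse each run of spaces/dashes into one dash
            ),
            array('', '-'),
            $urlPart
        )
    );

    // Assumption, not part of the original suggestion: strip dashes at either end.
    return trim($slug, '-');
}

echo slugify("Is it better to use ob_get_contents() or \$text .= 'test';"), "\n";
// is-it-better-to-use-obgetcontents-or-text-test

echo slugify('large-scale'), "\n";
// large-scale
```

Note that the patterns only know ASCII letters and digits, so accented or non-Latin characters are dropped entirely rather than transliterated.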