Replace non-html links with <a> tags

Posted 2020-04-21 05:12

Question:

I have a block of code that will take a block of text like the following:

Sample text sample text http://www.google.com sample text

It uses preg_replace_callback with the following regular expression:

$str = preg_replace_callback('/http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/',
    function ($matches) {
        // $matches[1] is the URL without the leading http://
        $url = $matches[1];
        // Truncate long URLs to 35 characters for the link text
        $anchorText = (strlen($url) > 35 ? substr($url, 0, 35) . '...' : $url);
        return '<a href="http://' . $url . '">' . $anchorText . '</a>';
    },
    $str);

This converts the sample text to:

Sample text sample text <a href="http://www.google.com">www.google.com</a> sample text

My problem now is that we have introduced a rich text editor that can create links before the text is sent to the script. I need to update this piece of code so that it ignores any URLs that are already inside an <a> tag.

Answer 1:

Add code to the beginning of the pattern to capture an opening anchor tag, and then do not perform the callback code when it has captured something:

/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/

You will then need to add an if to your lambda function to check whether $matches[1] contains anything. (Don't forget to increment your capture indexes as well: the URL is now in $matches[2].)
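A minimal sketch of that check, assuming the pattern above and the callback from the question. The function name linkify is hypothetical, and the closure replaces create_function, which is no longer available in PHP 8:

```php
<?php
// Hypothetical helper; the pattern is the one suggested in this answer.
function linkify(string $str): string
{
    $pattern = '/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/';

    $result = preg_replace_callback($pattern, function (array $matches): string {
        // Group 1 is non-empty when an opening <a ...> tag directly precedes
        // the URL, so hand the whole match back unchanged.
        if ($matches[1] !== '') {
            return $matches[0];
        }
        // The URL (without its scheme) has moved from group 1 to group 2.
        $url = $matches[2];
        $anchorText = strlen($url) > 35 ? substr($url, 0, 35) . '...' : $url;
        return '<a href="http://' . $url . '">' . $anchorText . '</a>';
    }, $str);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $str;
}

echo linkify('Sample text http://www.google.com sample text'), "\n";
echo linkify('See <a href="http://www.google.com">http://www.google.com</a> here'), "\n";
```

This only skips a URL that immediately follows the opening tag. A link whose text is not itself the URL (for example, <a href="http://www.google.com">Google</a>) would still have the URL in its href matched, so that case likely needs a different approach.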

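You cannot use a negative lookbehind assertion here because the opening tag is not a fixed length. You could instead add a negative lookahead assertion for the closing tag, which is intended to drop the entire match when the URL is followed by </a>. Be aware that the greedy + can give back one character to satisfy the lookahead, so test this variant before relying on it: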

/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)(?!<\/a>)/