I wrote this function to convert URLs on one specific domain (mywebsite.com) into links and to replace every other URL with @@@spam@@@.
function get_global_convert_all_urls($content) {
$content = strtolower($content);
$replace = "/(?:http|https)?(?:\:\/\/)?(?:www.)?(([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:\/.*)?/im";
preg_match_all($replace, $content, $search);
$total = count($search[0]);
for($i=0; $i < $total; $i++) {
$url = $search[0][$i];
if(preg_match('/mywebsite.com/i', $url)) {
$content = str_replace($url, '<a href="'.$url.'">'.$url.'</a>', $content);
} else {
$content = str_replace($url, '@@@spam@@@', $content);
}
}
return $content;
}
The only problem I can't solve is that the regex does not stop at a space when there are two URLs on one line.
$content = "http://www.mywebsite.com/index.html http://www.others.com/index.html";
Result:
<a href="http://www.mywebsite.com/index.html http://www.others.com/index.html">http://www.mywebsite.com/index.html http://www.others.com/index.html</a>
How can I get the result below:
<a href="http://www.mywebsite.com/index.html">http://www.mywebsite.com/index.html</a> @@@spam@@@
I have tried adding (\s|$) at the end of the regex, but no luck:
/(?:http|https)?(?:\:\/\/)?(?:www.)?(([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:\/.*)?(\s|$)/im
Change the regexp pattern to capture the last URL section (/index.html, /index.php). Change your function content as shown below:
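A minimal sketch of that change, assuming the intent is for the optional path to stop at the first whitespace character. Only the tail of the pattern differs from the function in the question: (?:\/.*)? becomes (?:\/[^\s]*)?. The choice of [^\s]* is my assumption; everything else is kept as in the question.

```php
<?php
function get_global_convert_all_urls($content) {
    $content = strtolower($content);
    // Same pattern as in the question, except the path part:
    // "[^\s]*" (assumed) cannot cross a space, while ".*" runs to end of line.
    $replace = '/(?:http|https)?(?:\:\/\/)?(?:www.)?(([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:\/[^\s]*)?/im';
    preg_match_all($replace, $content, $search);
    $total = count($search[0]);
    for ($i = 0; $i < $total; $i++) {
        $url = $search[0][$i];
        if (preg_match('/mywebsite.com/i', $url)) {
            // URL on the allowed domain: wrap it in a link.
            $content = str_replace($url, '<a href="'.$url.'">'.$url.'</a>', $content);
        } else {
            // Any other URL: replace it with the spam marker.
            $content = str_replace($url, '@@@spam@@@', $content);
        }
    }
    return $content;
}

echo get_global_convert_all_urls("http://www.mywebsite.com/index.html http://www.others.com/index.html");
// prints: <a href="http://www.mywebsite.com/index.html">http://www.mywebsite.com/index.html</a> @@@spam@@@
```

With this tail each URL becomes its own match, so the loop handles them one at a time.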
The output:
Edited based on change in your question.
The problem is the .* at the end of your regex, so my suggestion is to replace it with a more precise expression. I cooked this up real quick; you'll want to run some tests to verify your cases. =)
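As a rough illustration of "more precise" and of the kind of quick test meant here, the sketch below swaps .* for an explicit class of common URL path characters and runs it over a few inputs. The function name find_urls_sketch, the character class and the sample strings are all hypothetical, so adjust them to the URLs you actually expect.

```php
<?php
// Hypothetical helper: returns every URL-like match found in $text.
function find_urls_sketch($text) {
    // Assumed path class: word characters plus . / ? = & % # ~ + -
    // A space is not in the class, so a match ends at the first space.
    $pattern = '/(?:https?:\/\/)?(?:www\.)?(?:[a-z0-9-]+\.)+[a-z]+(?:\/[\w.\/?=&%#~+-]*)?/i';
    preg_match_all($pattern, $text, $found);
    return $found[0];
}

// Made-up sample inputs for a quick manual check.
$cases = array(
    'http://www.mywebsite.com/index.html http://www.others.com/index.html',
    'see mywebsite.com/a/b.php?id=1&x=2 for details',
    'no links here',
);

foreach ($cases as $case) {
    echo $case, "\n  => ", implode(' | ', find_urls_sketch($case)), "\n";
}
```

An explicit class is stricter than "anything but whitespace", so it will cut a URL short at any character missing from the list; that is the trade-off to test for.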
Results in:
Change the last element of the regex, (?:\/.*)?, into \S*. Your regex matches every character up to the end of the line, including spaces; \S* matches only characters that are not whitespace. You could also simplify the whole regex into:
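One possible simplification, offered as a sketch rather than a definitive pattern: an optional scheme, a dotted host name, then \S*. It also swaps the str_replace loop for preg_replace_callback (my substitution, not part of the question's code), so each match is rewritten exactly once even when the same URL appears twice. The function name convert_urls_sketch is hypothetical.

```php
<?php
// Hypothetical name; a single-pass variant of the function in the question.
function convert_urls_sketch($content) {
    // Assumed simplified pattern: optional scheme, dotted host, then any
    // run of non-whitespace characters.
    $pattern = '/(?:https?:\/\/)?(?:[a-z0-9-]+\.)+[a-z]+\S*/i';
    return preg_replace_callback($pattern, function ($m) {
        $url = $m[0];
        // Same loose check as the question: the domain appears somewhere in the match.
        if (stripos($url, 'mywebsite.com') !== false) {
            // Escape the URL before placing it into HTML.
            $safe = htmlspecialchars($url, ENT_QUOTES);
            return '<a href="' . $safe . '">' . $safe . '</a>';
        }
        return '@@@spam@@@';
    }, $content);
}

echo convert_urls_sketch("http://www.mywebsite.com/index.html http://www.others.com/index.html");
// prints: <a href="http://www.mywebsite.com/index.html">http://www.mywebsite.com/index.html</a> @@@spam@@@
```

Note that \S* also swallows punctuation directly after a URL (a trailing comma or full stop), and the substring check accepts any match that merely contains mywebsite.com, so both are worth tightening if that matters for your input.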