Let's say that $content is the content of a textarea:
/*Convert the http/https to link */
$content = preg_replace('!((https://|http://)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="$1">$1</a> ', nl2br($_POST['helpcontent'])." ");
/*Convert the www. to link prepending http://*/
$content = preg_replace('!((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$1">$1</a> ', $content." ");
This was working fine for links, but I realised that it breaks the markup when there is an image within the text...
I am now trying this:
$content = preg_replace('!\s((https?://|http://)+[a-z0-9_./?=&-]+)!i', ' <a href="$1">$1</a> ', nl2br($_POST['content'])." ");
$content = preg_replace('!((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$1">$1</a> ', $content." ");
As it stands, images are left intact, but URLs in http:// or https:// form are no longer converted:
google.com -> Not converted (as expected)
www.google.com -> Converted correctly
http://google.com -> Not converted (unexpected)
https://google.com -> Not converted (unexpected)
What am I missing?
-EDIT-
Current almost working solution:
$content = preg_replace('!(\s|^)((https?://)+[a-z0-9_./?=&-]+)!i', ' <a href="$2" target="_blank">$2</a> ', nl2br($_POST['content'])." ");
$content = preg_replace('!(\s|^)((www\.)+[a-z0-9_./?=&-]+)!i', '<a target="_blank" href="http://$2" target="_blank">$2</a> ', $content." ");
The problem is that with this input:
www.funcook.com http://www.funcook.com https://www.funcook.com
funcook.com http://funcook.com https://funcook.com
all the URLs I want (everything except the bare name.domain form) are converted as expected, but this is the output:
www.funcook.com http://www.funcook.com https://www.funcook.com ;
funcook.com http://funcook.com https://funcook.com
Note that a ; is inserted. Any idea why?
Try this:
preg_replace('!(\s|^)((https?://|www\.)+[a-z0-9_./?=&-]+)!i', ' <a href="$2">$2</a> ',$text);
It will pick up links beginning with http:// or with www.
Example
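One caveat with the replacement string above is that a www. link ends up with href="www...", which browsers treat as a relative URL. A minimal sketch of one way around that, assuming preg_replace_callback is acceptable here; the function name linkify is hypothetical and not from the original answer:

```php
<?php
// Hypothetical helper; the pattern is the one from the answer above.
function linkify($text) {
    return preg_replace_callback(
        '!(\s|^)((https?://|www\.)+[a-z0-9_./?=&-]+)!i',
        function ($m) {
            // $m[1] is the leading whitespace (empty at the start of the text),
            // $m[2] is the URL as typed by the user.
            $href = (stripos($m[2], 'http') === 0) ? $m[2] : 'http://' . $m[2];
            return $m[1] . '<a href="' . $href . '" target="_blank">' . $m[2] . '</a>';
        },
        $text
    );
}

echo linkify('www.funcook.com http://funcook.com funcook.com');
```

With this input the first two words become links (the first with http:// prepended in the href) and the bare funcook.com is left untouched.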
You can't, at least not with 100% accuracy, because there may be links such as stackoverflow.com which do not have www.
If you're only targeting those links:
!(www\.\S+)!i
Should work well enough for you.
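As a rough illustration of that pattern in use (the sample text and variable names are assumptions): note that \S+ runs up to the next whitespace, so trailing punctuation would become part of the link, and the pattern would also fire on the www. inside an http://www... URL.

```php
<?php
$text = 'Visit www.google.com or google.com today';

// Only words containing "www." are touched; "google.com" is left alone.
$linked = preg_replace('!(www\.\S+)!i', '<a href="http://$1">$1</a>', $text);

echo $linked;
```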
EDIT: As for your newest question, about why http links don't get converted but https links do: your first pattern only matches https://, or http://. (that is, http:// followed by a literal dot), which is never the case. Simplify it by replacing:
(https://|http://\.)
With
(https?://)
Which will make the s optional.
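A quick way to check this, as a small sketch (the test URLs are only illustrative):

```php
<?php
// "s?" means "zero or one s", so both schemes match the same pattern.
$pattern = '!^https?://!i';

foreach (['http://google.com', 'https://google.com', 'www.google.com'] as $url) {
    printf("%-20s %s\n", $url, preg_match($pattern, $url) ? 'match' : 'no match');
}
```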
Another way to add hyperlinks is to take the text you want to parse, explode it into an array of words, loop over it with foreach (a very fast construct, see http://www.phpbench.com/), and turn anything that starts with http://, https:// or www., or ends with .com, .org, etc., into a link.
I'm thinking maybe something like this:
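$userTextArray = explode(" ", $userText);
foreach ($userTextArray as &$word) {
    // Link words that start with http://, https:// or www.
    if (preg_match('!^(https?://|www\.)!i', $word)) {
        $href = (stripos($word, 'http') === 0) ? $word : 'http://' . $word;
        $word = '<a href="' . $href . '">' . $word . '</a>';
    }
}
unset($word); // break the reference left over from the loop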
Your changes will be reflected in the array because of the "&" before $word in the foreach statement.
Now just implode the array back into a string and you're good to go.
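Putting the whole idea together, here is a sketch under a few assumptions: it uses preg_split instead of explode so that newlines and repeated spaces survive the round trip, it only checks for the http://, https:// and www. prefixes, and the function name linkifyWords is hypothetical.

```php
<?php
// Hypothetical helper illustrating the split / loop / join approach.
function linkifyWords($userText) {
    // Split on runs of whitespace but keep the separators in the array,
    // so joining the parts afterwards restores the original spacing.
    $parts = preg_split('/(\s+)/', $userText, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as &$word) {
        if (preg_match('!^(https?://|www\.)!i', $word)) {
            $href = (stripos($word, 'http') === 0) ? $word : 'http://' . $word;
            $word = '<a href="' . htmlspecialchars($href) . '">'
                  . htmlspecialchars($word) . '</a>';
        }
    }
    unset($word); // break the reference left over from the loop

    return implode('', $parts);
}

echo linkifyWords("see www.funcook.com\nor http://funcook.com now");
```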
This made sense in my head... But I'm not 100% sure that this is what you're looking for
I had a similar problem. Here is a function which helped me. Maybe it will fit your needs too:
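function clHost($Address) {
    $parseUrl = parse_url(trim($Address));
    // Without a scheme, parse_url() puts everything (e.g. "www.example.com/x") in 'path'
    $host = isset($parseUrl['host']) ? $parseUrl['host'] : '';
    $path = isset($parseUrl['path']) ? $parseUrl['path'] : '';
    return str_replace("www.", "", trim($host . $path, '/'));
}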
This function returns the domain (plus any path) without the protocol and "www.", so you can add them yourself later.
For example:
$url = "http://www.". clHost($link);
I did it this way because I couldn't find a good regexp.
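To make the behaviour concrete, here is a self-contained sketch that repeats the function (with quoted array keys, which current PHP versions require) and runs it on a few sample inputs; the sample URLs are assumptions. Note that the original scheme is discarded, so an https:// link comes back as http://.

```php
<?php
function clHost($Address) {
    $parseUrl = parse_url(trim($Address));
    // Without a scheme, parse_url() reports the whole string as 'path'.
    $host = isset($parseUrl['host']) ? $parseUrl['host'] : '';
    $path = isset($parseUrl['path']) ? $parseUrl['path'] : '';
    return str_replace('www.', '', trim($host . $path, '/'));
}

foreach (['http://www.funcook.com/', 'www.funcook.com', 'https://funcook.com/recipes'] as $link) {
    // Normalise every variant to the same http://www. form.
    echo 'http://www.' . clHost($link), "\n";
}
```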
\s((https?://|www.)+[a-z0-9_./?=&-]+)
The problem is that the leading \s forces the match to start with whitespace, so a URL that is not preceded by whitespace (for example, one at the very start of the text) never matches. The regexp is fine without the \s, but to avoid replacing the image URLs you need to add something that excludes them.
If the images are plain HTML, use this:
(?<!src=")((https?://|www\.)+[a-z0-9_./?=&-]+)
The negative lookbehind (?<!src=") rejects any URL that is immediately preceded by src=", so image sources are ignored.
If you use some other markup, tell me and I'll try to find another way to avoid the images.
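A sketch of the lookbehind idea in a full replacement follows. It is a variation rather than the exact pattern above: as an assumption on my part, the lookbehind is widened to a single-character class, because with (?<!src=") alone the engine can still start a match at the www. inside src="http://www...". The sample HTML is made up for illustration.

```php
<?php
$html = '<img src="http://www.funcook.com/logo.png"> see http://funcook.com and www.funcook.com';

// Reject a URL that directly follows a quote, "=", "/", "." or a word character.
// That covers src="..." (and href="...") and also the "www." part inside them.
$pattern = '!(?<![\w"=/.])((?:https?://|www\.)[a-z0-9_./?=&-]+)!i';

$out = preg_replace_callback($pattern, function ($m) {
    // Prepend a scheme only when the user typed a bare www. address.
    $href = (stripos($m[1], 'http') === 0) ? $m[1] : 'http://' . $m[1];
    return '<a href="' . $href . '" target="_blank">' . $m[1] . '</a>';
}, $html);

echo $out;
```

The img tag passes through unchanged, while the two URLs in the running text are wrapped in links.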