I've been looking for a simple regex for URLs. Does anybody have one handy that works well? I didn't find one in the Zend Framework validation classes, and I have seen several implementations.
Thanks
I've used this on a few projects. I don't believe I've run into issues, but I'm sure it's not exhaustive:
Most of the random junk at the end is to deal with situations like
http://domain.com.
in a sentence (to avoid matching the trailing period). I'm sure it could be cleaned up, but since it worked, I've more or less just copied it over from project to project.
As per John Gruber (Daring Fireball):
Regex:
Using it in preg_match():
Here is the extended regex pattern (with comments):
For more details please look at: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
Peter's regex doesn't look right to me, for several reasons. It allows all kinds of special characters in the domain name and doesn't test for much.
Frankie's function looks good to me and you can build a good regex from the components if you don't want a function, like so:
Untested but I think that should work.
Also, Owen's answer doesn't look 100% right either. I took the domain part of the regex and tested it in a regex tester tool: http://erik.eae.net/playground/regexp/regexp.html
I put the following line:
in the "regexp" section and the following line:
under the "sample text" section.
The result allowed the minus character through, because \S matches any non-space character.
Note the regex from Frankie handles the minus because it has this part for the first character:
Which won't allow the minus or any other special character.
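To make the difference concrete, here is a minimal sketch comparing a `\S`-based host pattern with one that requires an alphanumeric first character. Both patterns and the `host_matches()` helper are illustrative stand-ins written for this comparison; they are not the actual expressions from Owen's or Frankie's answers.

```php
<?php
// Hypothetical patterns for illustration only; neither is the exact
// expression from the answers discussed above.

// \S accepts any non-space character, including a leading minus.
$loosePattern = '/^\S+\.[a-z]{2,}$/i';

// The label must start and end with a letter or digit; minus is only allowed inside.
$strictPattern = '/^[a-z\d](?:[a-z\d-]*[a-z\d])?\.[a-z]{2,}$/i';

// Small helper (assumed name) that turns preg_match() into a boolean.
function host_matches(string $pattern, string $host): bool
{
    return preg_match($pattern, $host) === 1;
}

foreach (['example.com', '-example.com'] as $host) {
    printf(
        "%-13s loose=%s strict=%s\n",
        $host,
        var_export(host_matches($loosePattern, $host), true),
        var_export(host_matches($strictPattern, $host), true)
    );
}
```

The loose pattern accepts both hosts, while the strict one rejects `-example.com` because its first character class has no room for the minus.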
Just in case you want to know if the URL really exists:
Edit:
As incidence pointed out, this code has been deprecated since the release of PHP 5.3.0 (2009-06-30), so use it with that in mind.
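For readers on newer PHP versions, one possible alternative is sketched below. It is not the code from this answer: it only checks that the URL parses, uses http or https, and that its host has a DNS A record, which is weaker than confirming the page itself exists. The function name `url_host_resolves()` and the optional `$resolver` argument are assumptions added for illustration (the latter so the lookup can be stubbed without network access).

```php
<?php
/**
 * Returns true when the URL has an http(s) scheme and its host resolves.
 *
 * $resolver is a hypothetical injection point: a callable taking a host name
 * and returning bool. When omitted, checkdnsrr() is asked for an A record,
 * so hosts that only publish AAAA records would be reported as missing.
 */
function url_host_resolves(string $url, ?callable $resolver = null): bool
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return false; // not parseable, or no scheme/host to look up
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false; // only web URLs are considered in this sketch
    }

    $resolver = $resolver ?? static function (string $host): bool {
        return checkdnsrr($host, 'A');
    };

    return (bool) $resolver($parts['host']);
}
```

Calling `url_host_resolves('http://example.com/')` performs a real DNS lookup, so the result depends on the network the script runs on.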
Just my two cents, but I've developed this function and have been using it for a while with success. It's well documented and split into parts, so you can easily change it.
Use the filter_var() function to validate whether a string is a URL or not:
It is bad practice to use regular expressions when not necessary.
EDIT: Be careful, this solution is not Unicode-safe and not XSS-safe. If you need more complex validation, it may be better to look elsewhere.
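A minimal sketch of the filter_var() approach follows. The wrapper name `is_http_url()` and the http/https scheme whitelist are assumptions added here; the whitelist is one way to narrow the XSS concern mentioned above (for example `javascript:` URLs), and it does nothing for the Unicode limitation.

```php
<?php
// Hypothetical wrapper around filter_var(); the scheme whitelist is an
// extra check added for this sketch, not part of FILTER_VALIDATE_URL itself.
function is_http_url(string $url): bool
{
    // filter_var() returns the filtered string on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Accept only web schemes, regardless of what the filter let through.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

var_dump(is_http_url('http://example.com/path?x=1')); // bool(true)
var_dump(is_http_url('example.com'));                 // bool(false): no scheme
var_dump(is_http_url('javascript://alert(1)'));       // bool(false): scheme not whitelisted
```

Output that ends up in HTML still needs escaping (for instance with htmlspecialchars()); validation alone does not make a URL safe to print.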