I'm looking for a regex that will let me check whether a string refers to a website address itself rather than to a specific page on that website.
So it would match:
http://google.com
ftp://google.com
http://google.com/
http://lots.of.subdomains.google.com
But not:
http://google.com/search.whatever
ftp://google.com/search.whatever
http://lots.of.subdomains.google.com/search.whatever
Any ideas? I can't quite figure out how to handle allowing the / at the end of the URL.
Great answer by Jeremy. Depending on which regex dialect you're using to match, you might want to wrap the whole expression with anchors (to avoid matching URLs like http://example.com/bin/cgi?returnUrl=http://google.com), and maybe generalize the valid protocol and domain name characters:
Try this:
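As a rough illustration of the anchoring advice (this is not the pattern from any of the answers here), the following is a minimal Python sketch. The pattern body, the names BODY, UNANCHORED, SITE_ROOT and is_site_root, and the choice of allowed scheme and host characters are all assumptions made for the example.

```python
import re

# Hypothetical pattern body: a scheme, "://", a host made of two or more
# dot-separated labels, and an optional trailing slash. The character
# classes are an assumption, not taken from any specification.
BODY = r"(?:[a-z][a-z0-9+.-]*://[a-z0-9-]+(?:\.[a-z0-9-]+)+/?)"

# Without anchors, the pattern can match a URL embedded in a longer string.
UNANCHORED = re.compile(BODY, re.IGNORECASE)

# \A and \Z force the pattern to cover the whole string.
SITE_ROOT = re.compile(r"\A" + BODY + r"\Z", re.IGNORECASE)


def is_site_root(url: str) -> bool:
    """Return True if url is a bare site address, with or without a final slash."""
    return SITE_ROOT.match(url) is not None
```

With this sketch, is_site_root accepts the four "match" examples from the question and rejects the three "not" examples, while UNANCHORED.search still finds a match inside http://example.com/bin/cgi?returnUrl=http://google.com, which is the case the anchors are meant to rule out.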
The pattern in this answer is a shortened version of my full URI validation pattern, based on the specification. I wrote it because the specification allows many characters that I have never seen included in any validation pattern on the web. You'll see that the user/pass part (and, in the second pattern, the path and query string) is far more permissive than you might expect.
And since I've taken the time to break this out to be somewhat more readable, here is the complete pattern:
Note that some (all?) JavaScript implementations do not support comments in regular expressions.
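To illustrate the point about comments, here is a minimal Python sketch of two ways to keep a pattern readable. The pattern itself is a simplified, hypothetical site-address pattern (not the full URI pattern from this answer), and the names COMMENTED, PARTS and JOINED are made up for the example. The first form relies on Python's re.VERBOSE flag, which ignores whitespace and # comments inside the pattern. The second form keeps the comments in the host language instead, so the same idea should carry over to engines that have no comment syntax.

```python
import re

# Form 1: comments inside the pattern, enabled by re.VERBOSE.
COMMENTED = re.compile(
    r"""
    \A
    [a-z][a-z0-9+.-]* ://     # scheme, e.g. http or ftp (assumed character set)
    [a-z0-9-]+                # first host label
    (?: \. [a-z0-9-]+ )+      # one or more further dot-separated labels
    /?                        # optional trailing slash
    \Z
    """,
    re.VERBOSE | re.IGNORECASE,
)

# Form 2: plain pattern fragments, commented in the host language and joined.
PARTS = [
    r"\A",
    r"[a-z][a-z0-9+.-]*://",   # scheme
    r"[a-z0-9-]+",             # first host label
    r"(?:\.[a-z0-9-]+)+",      # further labels
    r"/?",                     # optional trailing slash
    r"\Z",
]
JOINED = re.compile("".join(PARTS), re.IGNORECASE)
```

Both compiled patterns are intended to accept and reject the same strings; the second form is simply the first with the whitespace and comments moved out of the regex.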