Problem
The user can submit a form containing a link to sitea.com. What I want to do is check whether the URL they submitted actually points to sitea.com.
What I've tried
I've tried using a regex to check that the URL is well-formed and contains sitea.com. But that leaves a gap: anyone can append ?haha=sitea.com to a URL and still get a match. And because I'm no regex master, my "solution" ends here.
My question
Is it possible to check whether $_POST['url'] is actually a link to sitea.com?
I think parse_url() is the best fit here. A regex may work, but it's better to avoid one when a built-in function is available.
I'd do something like:
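$url = '...';
// Take the host, split it on dots, and keep only the last two labels (e.g. "sitea.com").
// The (string) cast covers parse_url() returning null when the URL has no host.
$domain = implode('.', array_slice(explode('.', (string) parse_url($url, PHP_URL_HOST)), -2));
if ($domain == 'sitea.com') {
    # code...
}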
As a function:
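function getDomain($url)
{
    // parse_url() returns null when the URL has no host; the cast keeps explode() happy.
    $host = (string) parse_url($url, PHP_URL_HOST);
    // Keep only the last two labels of the host, so "sub.sitea.com" becomes "sitea.com".
    $domain = implode('.', array_slice(explode('.', $host), -2));
    return $domain === 'sitea.com';
}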
Test cases:
var_dump(getDomain('http://sitea.com/'));
var_dump(getDomain('http://sitea.com/directory'));
var_dump(getDomain('http://subdomain.sitea.com/'));
var_dump(getDomain('http://sub.subdomain.sitea.com/#test'));
var_dump(getDomain('http://subdomain.notsitea.com/#dsdf'));
var_dump(getDomain('http://sitea.somesite.com'));
var_dump(getDomain('http://example.com/sitea.com'));
var_dump(getDomain('http://sitea.example.com/test.php?haha=sitea.com'));
Output:
bool(true)
bool(true)
bool(true)
bool(true)
bool(false)
bool(false)
bool(false)
bool(false)
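A slightly more defensive variant might look like the sketch below. It is only an illustration under a few assumptions: the name isHostOnDomain() and its $domain parameter are hypothetical (not part of PHP or of the answer above), and the type declarations assume PHP 7 or later. It lowercases the host, since parse_url() leaves its case untouched, and it compares against a ".sitea.com" suffix instead of slicing the last two labels, which would also keep working for a target domain with more than two labels.

```php
<?php
// Hypothetical helper for illustration; the name and the $domain parameter are assumptions.
function isHostOnDomain(string $url, string $domain = 'sitea.com'): bool
{
    // parse_url() returns null when the URL has no host and false when it is seriously malformed.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }

    // Host names are case-insensitive; a trailing dot only marks a fully qualified name.
    $host = rtrim(strtolower($host), '.');
    $domain = strtolower($domain);

    // Accept the domain itself or any subdomain of it (host ends with ".sitea.com").
    $suffix = '.' . $domain;
    return $host === $domain || substr($host, -strlen($suffix)) === $suffix;
}
```

Note that a value without a scheme, such as 'sitea.com/page', is rejected here, because parse_url() treats the whole string as a path and reports no host. Whether that is the desired behaviour for the form depends on what input you want to accept.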
This might not be a job for regexes, but for existing tools in your language of choice. Regexes are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.
In PHP, use the parse_url function.
Perl: URI module.
Ruby: URI module.
.NET: Uri class.
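To illustrate why a URL parser closes the gap described in the question, here is a minimal PHP sketch. The URL is made up for the example (only the ?haha=sitea.com part comes from the question). parse_url() splits the string into components, so the query string can no longer be mistaken for the host.

```php
<?php
// Illustrative URL: it contains "sitea.com", but only inside the query string.
$url = 'http://example.com/page.php?haha=sitea.com';

// Without a second argument, parse_url() returns an associative array of components.
$parts = parse_url($url);

// Only the host says where the link actually points.
echo $parts['host'], "\n";   // example.com
echo $parts['path'], "\n";   // /page.php
echo $parts['query'], "\n";  // haha=sitea.com
```

A check that looks only at $parts['host'] is therefore unaffected by whatever appears in the path or the query string.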