PHP validation/regex for URL

Posted 2018-12-31 04:30

I've been looking for a simple regex for URLs. Does anybody have one handy that works well? I didn't find one in the Zend Framework validation classes, and I have seen several different implementations.


Reply 2 · 2018-12-31 04:37

I don't think a regular expression is the right tool in this case. No pattern can match every possible URL, and even if one did, the URL might still not exist.

Here is a very simple way to test whether a URL actually exists and is readable:

if (preg_match("#^https?://.+#", $link) and @fopen($link,"r")) echo "OK";

(Without the preg_match check, fopen would also accept any local file path on your server.)
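
If only a syntax check is wanted and no network request, PHP's built-in filter_var() with FILTER_VALIDATE_URL is an alternative to a hand-written pattern. The following is a minimal sketch under that assumption; the function name looks_like_http_url is hypothetical, and the check says nothing about whether the URL exists.

```php
<?php
// Hypothetical helper: syntactic check only, no network access.
function looks_like_http_url($link)
{
    // FILTER_VALIDATE_URL returns the URL on success and false on failure.
    if (filter_var($link, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // Restrict the scheme so that ftp:// or file:// links are rejected.
    $scheme = strtolower((string) parse_url($link, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

var_dump(looks_like_http_url('http://example.com/page?id=1')); // bool(true)
var_dump(looks_like_http_url('/etc/passwd'));                  // bool(false)
var_dump(looks_like_http_url('ftp://example.com/file'));       // bool(false)
```

The scheme test plays the same role as the preg_match in the one-liner above: it keeps local paths and other protocols out before anything is opened.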

Reply 3 · 2018-12-31 04:38

I've used this one with good success, though I don't remember where I got it from:

$pattern = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";
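
Because this pattern is not anchored with ^ and $, it is suited to finding URLs inside longer text rather than validating a whole string. A small sketch of that use, where the function name extract_urls and the sample text are assumptions made for illustration:

```php
<?php
// Pattern copied unchanged from the answer above.
$pattern = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";

// Hypothetical helper: returns every substring of $text that the pattern matches.
function extract_urls($pattern, $text)
{
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = 'See http://example.com/docs, or www.example.org.';
print_r(extract_urls($pattern, $text));
// Finds "http://example.com/docs" and "www.example.org"
```

The final character class leaves out punctuation such as "," and ".", which is why the trailing comma and full stop in the sample text are not swallowed into the matches.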

Reply 4 · 2018-12-31 04:39

Here's a simple class that validates a URL with a regex and then cross-references the domain against popular RBL (Realtime Blackhole List) servers:


require 'URLValidation.php';
$urlVal = new UrlValidation(); //Create Object Instance

Pass a URL as the parameter of the domain() method and check the return value.

$urlArray = ['', '', ''];
foreach ($urlArray as $k => $v) {
    // var_dump() prints its argument itself and returns nothing.
    var_dump($urlVal->domain($v));
    echo ' URL: ' . $v . '<br>';
}

bool(false) URL:
bool(true) URL:
bool(true) URL:

As you can see above, the first URL is listed as a malicious website by an RBL, so domain() returned false for it.
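
The class itself is not shown here, so the following is only a sketch of how a domain-based blocklist lookup is commonly done over DNS, not the code of URLValidation.php. It assumes a list that is queried as "domain.zone" and answers with an A record when the domain is listed; the helper names and the zone rbl.example.net are hypothetical placeholders. IP-based lists use a reversed-address query instead and are not covered.

```php
<?php
// Hypothetical helper: builds the DNS name to query for a domain-based blocklist.
function rbl_query_name($domain, $zone)
{
    // Normalise case and strip stray dots or whitespace before joining.
    return strtolower(trim($domain, ". \t\n")) . '.' . trim($zone, '.');
}

// Returns true when the zone has an A record for the domain, i.e. it is listed.
// Needs network access; $zone must be a blocklist you are allowed to query.
function is_listed($domain, $zone)
{
    return checkdnsrr(rbl_query_name($domain, $zone) . '.', 'A');
}

echo rbl_query_name('Example.COM.', 'rbl.example.net'), "\n";
// prints "example.com.rbl.example.net"
```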

Reply 5 · 2018-12-31 04:44

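function is_valid_url($url = "") {
    // An empty string can never be a valid URL.
    if ($url == "") {
        return false;
    }

    $url = @parse_url($url);

    // parse_url() returns false for seriously malformed URLs;
    // a scheme and a host are both needed to build the request below.
    if (!$url || !isset($url['scheme'], $url['host'])) {
        return false;
    }

    $url = array_map('trim', $url);
    // Default to the standard port of the scheme when none is given.
    $defaultPort = (strtolower($url['scheme']) == 'https') ? 443 : 80;
    $url['port'] = isset($url['port']) ? (int) $url['port'] : $defaultPort;
    $path = isset($url['path']) ? $url['path'] : '';

    if ($path == '') {
        $path = '/';
    }

    $path .= isset($url['query']) ? "?$url[query]" : '';

    // gethostbyname() returns its argument unchanged when the lookup fails.
    if ($url['host'] != gethostbyname($url['host'])) {
        if (version_compare(PHP_VERSION, '5.0.0', '>=')) {
            $headers = @get_headers("$url[scheme]://$url[host]:$url[port]$path");
        } else {
            $fp = fsockopen($url['host'], $url['port'], $errno, $errstr, 30);

            if (!$fp) {
                return false;
            }
            fputs($fp, "HEAD $path HTTP/1.1\r\nHost: $url[host]\r\n\r\n");
            $headers = fread($fp, 128);
            fclose($fp);
        }
        $headers = is_array($headers) ? implode("\n", $headers) : (string) $headers;
        // Accept only 200, 301 or 302 in the status line.
        return (bool) preg_match('#^HTTP/\S+\s+(?:200|301|302)\b#i', $headers);
    }

    return false;
}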
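
The function above decides validity from the first line of the response, the HTTP status line. Note that square brackets in a regex form a character class, so a pattern written as [(200|301|302)]+ matches any run of the characters 0, 1, 2, 3, (, ) and | rather than one of three codes. A minimal sketch of the status check in isolation, with hypothetical helper names, that compares the parsed code as a number instead:

```php
<?php
// Hypothetical helper: pulls the three-digit status code out of an HTTP status line.
function status_code_from_line($statusLine)
{
    if (preg_match('#^HTTP/\S+\s+(\d{3})\b#i', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null; // not a status line
}

// True only for the three codes the answer treats as "URL exists".
function is_ok_or_redirect($statusLine)
{
    return in_array(status_code_from_line($statusLine), array(200, 301, 302), true);
}

var_dump(is_ok_or_redirect('HTTP/1.1 200 OK'));        // bool(true)
var_dump(is_ok_or_redirect('HTTP/1.1 404 Not Found')); // bool(false)
```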

Reply 6 · 2018-12-31 04:45

OK, so this is a little more complex than a simple regex, but it allows for different types of URLs, all of which should be marked as valid.

function is_valid_url($url) {
    // First check: is the URL just a domain name? (allow a slash at the end)
    $_domain_regex = '|^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z]{2,})/?$|';
    if (preg_match($_domain_regex, $url)) {
        return true;
    }

    // Second: check if it's a URL with a scheme and all
    $_regex = '#^([a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))$#';
    if (preg_match($_regex, $url, $matches)) {
        // Pull out the domain name, and make sure that the domain is valid.
        $_parts = parse_url($url);
        // parse_url() can fail or omit parts, e.g. when the URL has no scheme.
        if ($_parts === false || !isset($_parts['scheme'], $_parts['host'])) {
            return false;
        }
        if (!in_array($_parts['scheme'], array('http', 'https'))) {
            return false;
        }

        // Check the host against the domain regex; rejects hosts that are not well-formed domain names
        if (!preg_match($_domain_regex, $_parts['host'])) {
            return false;
        }

        // This domain looks pretty valid. Only way to check it now is to download it!
        return true;
    }

    return false;
}

Note that there is an in_array check for the protocols that you want to allow (currently only http and https are in that list).

var_dump(is_valid_url(''));         // true
var_dump(is_valid_url(''));        // true
var_dump(is_valid_url(''));  // true
var_dump(is_valid_url('')); // true
var_dump(is_valid_url('')); // true

Reply 7 · 2018-12-31 04:47

Here is the way I did it. I should mention that I am not so sure about the regex, but it should work, though :)

$pattern = "#((http|https)://(\S*?\.\S*?))(\s|\;|\)|\]|\[|\{|\}|,|”|\"|'|:|\<|$|\.\s)#i";
$text = preg_replace_callback($pattern, function ($m) {
    // $m[1] is the URL, $m[4] the delimiter that ended it.
    return "<a href=\"$m[1]\" target=\"_blank\">$m[1]</a>$m[4]";
}, $text);

This way you won't need the eval (/e) modifier on your pattern.

Hope it helps :)
