Regex for external links: how to make an exception

Posted 2019-06-09 13:45

I am using a regex to turn plain text into a link when the text starts with http://. This works fine, but I don't want the text converted if the link is internal (that is, a link that contains my website's name). I only want it to happen for external links.

How can I do that? I tried adding a ! before the http, but that did not work. Can someone help me out, please? This is what I am using:

function wpse107488_urls_to_links( $string ) {

    $string = preg_replace( "/([^\w\/])(www\.[a-z0-9\-]+\.[a-z0-9\-]+)/i", "$1http://$2", $string );

    $string = preg_replace( "/([\w]+:\/\/[\w-?&;%#~=\.\/\@]+[\w\/])/i", "<a target=\"_blank\" title=\"" . __( 'Visit Site', 'your-textdomain' ) . "\" href=\"$1\">$1</a>", $string);

    return $string;
}

Edit: I am also using a different function that turns my internal links into anchors (that is intended), but I think the two functions are interfering with each other. That's why I gave all my internal links a class. Can I exclude them using that class?

2 Answers
爷的心禁止访问
#2 · 2019-06-09 14:16

You should be able to use the following:

/((http|https):\/\/(?!www\.google\.co\.uk)[\w\.\/\-=?#]+)/ for http and https

OR

/(http:\/\/(?!www\.google\.co\.uk)[\w\.\/\-=?#]+)/ for http only

You can then replace www\.google\.co\.uk with your own domain name (in the format in which it appears on your site), escaping each dot as \. so that it matches a literal dot.

Used on the following text, it will match all URLs except for http://www.google.co.uk:

A few websites to test the regex http://www.google.co.uk http://www.myspace.com http://facebook.com http://www.youtube.com/watch?v=video32 it should have matched all but the google URL.

The above regex will also match YouTube video URLs etc. with GET strings attached, as well as URLs that contain an in-page anchor (i.e. #).
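
As a quick check of the pattern above, here is a minimal sketch that runs it over the sample sentence with preg_match_all. The variable names ($text, $pattern, $external) are only for illustration, and the dots in the excluded host are escaped so that they match literally.

```php
<?php
// Sample sentence taken from the answer above (trailing remark omitted).
$text = 'A few websites to test the regex http://www.google.co.uk '
      . 'http://www.myspace.com http://facebook.com '
      . 'http://www.youtube.com/watch?v=video32';

// Match http/https URLs unless the host starts with www.google.co.uk.
$pattern = '/https?:\/\/(?!www\.google\.co\.uk)[\w\.\/\-=?#]+/i';

// $matches[0] holds every full match found in $text.
preg_match_all($pattern, $text, $matches);
$external = $matches[0];

print_r($external);
```

With this input, the Google URL is skipped and the other three URLs are collected, including the query string on the YouTube link.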

Update

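The following regex wraps every external link that starts with http://, https:// or www. in an anchor tag that opens the URL in a new window/tab. The lookahead also covers an optional www. prefix, so both http://mydomain.com and http://www.mydomain.com are left alone. A match that starts with www. has no scheme, so run the first preg_replace from the question beforehand to prepend http://.

$string = preg_replace( "/((?:https?:\/\/|www\.)(?!(?:www\.)?mydomain\.com)[\w\.\/\-=?#]+)/i", "<a target='_blank' href='$1'>$1</a>", $string );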
forever°为你锁心
#3 · 2019-06-09 14:39

You can add a negative lookahead (?!...) ("not followed by") containing your domain name. Here is an example that uses a very general pattern* to detect URLs:

// The lookahead comes before the optional www. so that backtracking cannot bypass it.
$string = preg_replace('~\bhttps?://(?!(?:www\.)?mydomain\.com)[^\s/]+(?:/[^\s/]+)*/?~i',
                       '<a href="$0">$0</a>', $string );

* This means that I didn't spend time on the URL pattern; if you find a better pattern, use it instead. It is only meant to illustrate how to exclude your domain.
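
To make the lookahead idea easier to reuse, here is a minimal sketch of a helper built around it. The function name link_external_urls and its $ownHost parameter are hypothetical, not part of WordPress or of the answers above. It assumes $ownHost is given without a leading "www." and that the input is plain text rather than already-escaped HTML.

```php
<?php
/**
 * Wrap external http(s) URLs in anchor tags and leave URLs on $ownHost as plain text.
 * $ownHost is a hypothetical parameter, e.g. 'mydomain.com' (no leading "www.").
 */
function link_external_urls(string $text, string $ownHost): string
{
    // Escape the host so its dots match literally; '~' is the regex delimiter.
    $host = preg_quote($ownHost, '~');

    // The negative lookahead sits directly after the scheme and covers the
    // optional "www." itself, so backtracking cannot skip past it.
    // The inner (?![\w.-]) keeps e.g. mydomain.community from being excluded.
    $pattern = '~\bhttps?://(?!(?:www\.)?' . $host . '(?![\w.-]))'
             . '[^\s/<>"\']+(?:/[^\s<>"\']*)?~i';

    return preg_replace_callback($pattern, function (array $m): string {
        // Escape the URL before placing it in the attribute and the link text.
        $url = htmlspecialchars($m[0], ENT_QUOTES);
        return '<a target="_blank" href="' . $url . '">' . $url . '</a>';
    }, $text);
}

$out = link_external_urls(
    'See http://www.mydomain.com/about and http://example.org/page?x=1',
    'mydomain.com'
);
echo $out, "\n";
```

In this example only the example.org URL is wrapped in an anchor; the mydomain.com URL stays plain text, so a separate function for internal links can still handle it. If the text has already been HTML-escaped elsewhere, the htmlspecialchars call would escape it a second time and should be dropped.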
