I'm using the following regex to match all types of URLs in PHP (it works very well):
$reg_exUrl = "%\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s";
But now I want to exclude YouTube, youtu.be and Vimeo URLs.
After some research I tried something like this, but it is not working:
$reg_exUrl = "%\b(([\w-]+://?|www[.])(?!youtube|youtu|vimeo)[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s";
I want to do this because I have another regex that matches YouTube URLs and returns an iframe, and the general URL regex above conflicts with it.
Any help would be greatly appreciated, thanks.
socodLib, to exclude something from a string, place yourself at the beginning of the string by anchoring with a ^ (or use another anchor) and use a negative lookahead, written (?!...), to assert that the string doesn't contain a word.
Before we make the regex look too complex by combining it with yours, let's see what we would do if you wanted to match some word characters (\w+) but not youtube or google. After the assertion (where we say what we don't want), we say what we do want by using \w+.
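The simple case described above can be sketched as follows. This is a minimal illustration, not the exact pattern from the original answer: the delimiter, the variable names and the sample strings are assumptions chosen for the example.

```php
<?php
// Hypothetical pattern: anchor at the start with ^, assert with a negative
// lookahead that "youtube" or "google" appears nowhere in the string,
// then say what we do want: one or more word characters.
$pattern = '/^(?!.*(?:youtube|google))\w+/';

// Sample inputs (assumed for illustration).
$samples = ['example', 'youtube', 'mygoogle', 'vimeo'];

$results = [];
foreach ($samples as $s) {
    // preg_match returns 1 on a match, 0 on no match.
    $results[$s] = preg_match($pattern, $s) === 1;
}

var_dump($results);
```

Because the lookahead starts with .*, it rejects the word wherever it occurs in the string, not only at the start, which is why mygoogle is rejected as well.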
In your case, let's add a negative lookahead to your initial regex (which I have not tuned):
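One way this could look is sketched below. It is an assumption about how to combine the pieces, not a definitive pattern: the lookahead is placed right after \b and only inspects the current run of non-space characters ([^\s()<>]*), so that a YouTube link elsewhere in the text does not block other URLs. The sample text and the $urls variable are made up for the example.

```php
<?php
// Sketch only: the lookahead placement and the sample text are assumptions.
// (?i) makes the match case-insensitive; the negative lookahead rejects any
// candidate whose run of non-space characters contains youtube, youtu.be
// or vimeo. The rest is the URL regex from the question, unchanged.
$reg_exUrl = '%(?i)\b(?![^\s()<>]*(?:youtu\.?be|vimeo))'
           . '(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))%s';

$text = 'See http://example.com/page, https://www.youtube.com/watch?v=abc123, '
      . 'http://youtu.be/xyz, https://vimeo.com/12345 and www.php.net today.';

// Collect every URL that survives the exclusion.
preg_match_all($reg_exUrl, $text, $matches);
$urls = $matches[0];
```

With this sample text, only http://example.com/page and www.php.net should be captured. One limitation of this sketch: a URL on another site whose path happens to contain one of the excluded words (for example a blog post with vimeo in its slug) is rejected too.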
I took the liberty of making the regex case-insensitive with (?i). You could also have added i to your s modifier at the end. The youtu\.?be expression allows for an optional dot.
I am certain you can apply this recipe to your expression and to other regexes in the future.
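The two ways of enabling case-insensitivity mentioned above can be compared with a small sketch. The patterns and sample strings here are illustrative assumptions; only the youtu\.?be fragment comes from the answer.

```php
<?php
// Two ways to make a PCRE pattern case-insensitive in PHP.
$inline   = '%(?i)youtu\.?be%s';  // inline flag at the start of the pattern
$modifier = '%youtu\.?be%si';     // "i" appended to the existing "s" modifier

// Sample inputs (assumed for illustration).
$samples = ['YouTube.com', 'YOUTU.BE/abc', 'youtube', 'vimeo.com'];

$inlineHits = [];
$modifierHits = [];
foreach ($samples as $s) {
    $inlineHits[]   = preg_match($inline, $s);
    $modifierHits[] = preg_match($modifier, $s);
}

// Both forms should agree on every sample.
var_dump($inlineHits === $modifierHits);
```

The optional dot in youtu\.?be is what lets a single alternative cover both youtube and youtu.be.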