I've made a function (in JavaScript) that takes a URL from either YouTube or Vimeo. It figures out the provider and ID for that particular video (demo: http://jsfiddle.net/csjwf/).
    function parseVideoURL(url) {
        var provider = url.match(/http:\/\/(:?www.)?(\w*)/)[2],
            id;
        if (provider == "youtube") {
            id = url.match(/http:\/\/(?:www.)?(\w*).com\/.*v=(\w*)/)[2];
        } else if (provider == "vimeo") {
            id = url.match(/http:\/\/(?:www.)?(\w*).com\/(\d*)/)[2];
        } else {
            throw new Error("parseVideoURL() takes a YouTube or Vimeo URL");
        }
        return {
            provider : provider,
            id : id
        }
    }
It works, but as a regex novice I'm looking for ways to improve it. The input I'm dealing with typically looks like this:
http://vimeo.com/(id)
http://youtube.com/watch?v=(id)&blahblahblah.....
1) Right now I'm doing three separate matches. Would it make sense to do everything in a single expression? If so, how?
2) Could the existing matches be more concise? Are they unnecessarily complex, or perhaps insufficient?
3) Are there any YouTube or Vimeo URLs that would fail to parse? I've tried quite a few and so far it seems to work pretty well.
To summarize: I'm simply looking for ways to improve the above function. Any advice is greatly appreciated.
Here is my regex:
http://jsfiddle.net/csjwf/1/
I am not sure about your question 3), but provided that your assumption about the URL forms is correct, the regexes can be combined into one as follows:
You will get the matches in different positions (groups 1 and 2 for Vimeo, groups 3 and 4 for YouTube), so you just need to handle that.
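The combined expression itself is not shown here, so the following is only a minimal sketch consistent with the description: one alternation, with Vimeo's provider and ID in groups 1 and 2 and YouTube's in groups 3 and 4. The exact pattern (including allowing `-` in YouTube IDs and other query parameters before `v=`) is an assumption, not the original answer's regex.

```javascript
// Sketch only: groups 1-2 are filled for Vimeo, groups 3-4 for YouTube.
var VIDEO_URL = /^http:\/\/(?:www\.)?(?:(vimeo)\.com\/(\d+)|(youtube)\.com\/watch\?(?:.*&)?v=([\w-]+))/;

function parseVideoURL(url) {
    var m = url.match(VIDEO_URL);
    if (!m) {
        throw new Error("parseVideoURL() takes a YouTube or Vimeo URL");
    }
    // Only one branch of the alternation matches, so the other pair of groups is undefined.
    return m[1] ? { provider: m[1], id: m[2] } : { provider: m[3], id: m[4] };
}
```

Checking `m[1]` first is the "handle that" step: whichever pair of groups is defined tells you which provider matched.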
Or, if you are quite sure that Vimeo's IDs contain only digits, then you can do:
and the provider and the ID will appear in the 1st and 2nd groups, respectively.
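Again, the expression is missing here, so this is a hedged sketch of what a two-group version could look like. The pattern is an assumption: it makes the `watch?...v=` part optional and requires the ID to be followed by `&`, `#`, or the end of the string.

```javascript
// Sketch only: provider is always group 1 and the ID is always group 2.
var VIDEO_URL = /^http:\/\/(?:www\.)?(vimeo|youtube)\.com\/(?:watch\?(?:.*&)?v=)?([\w-]+)(?=[&#]|$)/;

function parseVideoURL(url) {
    var m = url.match(VIDEO_URL);
    if (!m) {
        throw new Error("parseVideoURL() takes a YouTube or Vimeo URL");
    }
    return { provider: m[1], id: m[2] };
}
```

The trade-off is that this version no longer checks that the ID shape fits the provider (for example, it would accept a non-numeric Vimeo ID), and it rejects URLs with anything other than `&` or `#` after the ID.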
Just in case, here is a PHP version:
For Vimeo, don't rely on a regex, as Vimeo tends to change its URL patterns every now and then. As of October 2nd, 2017, Vimeo supports six URL schemes in total.
Instead, use their API to validate Vimeo URLs. The oEmbed API (doc) takes a URL, checks its validity, and returns an object with a bunch of video information (check out the dev page). Although it is not intended for this purpose, we can easily use it to check whether a given URL is from Vimeo.
So, with Ajax it would look like this:
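The Ajax snippet is not reproduced here. As a rough sketch of the idea, the following uses `fetch` rather than whatever Ajax helper the original used. The endpoint URL and the helper names `buildOEmbedRequest` and `lookupVimeoVideo` are assumptions for illustration; check Vimeo's current oEmbed documentation before relying on them.

```javascript
// Assumed endpoint for Vimeo's oEmbed API; verify against the current docs.
var VIMEO_OEMBED = "https://vimeo.com/api/oembed.json";

// Builds the request URL; the video URL must be percent-encoded as a query value.
function buildOEmbedRequest(videoUrl) {
    return VIMEO_OEMBED + "?url=" + encodeURIComponent(videoUrl);
}

// Resolves to the oEmbed data when the API accepts the URL, or null otherwise.
// fetchImpl is optional and injectable so the function can run without network access.
function lookupVimeoVideo(videoUrl, fetchImpl) {
    var doFetch = fetchImpl || fetch;
    return doFetch(buildOEmbedRequest(videoUrl)).then(function (response) {
        return response.ok ? response.json() : null;
    });
}
```

A caller would treat a `null` result as "not a valid Vimeo URL" and otherwise read the video details from the returned object.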
Here's my attempt at the regex, which covers most of the updated cases:
3) Your regex does not match https URLs. I haven't tested it, but I guess the "http://" part would become "http(s)?://". Note that this would change the matching positions of the provider and ID.
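To illustrate the point about positions, here is a small sketch comparing the capturing form with `https?`, which needs no group at all. The variable names and the sample URL are made up for the example.

```javascript
// "(s)?" adds a capturing group, so the provider moves from group 1 to group 2.
var withCapture = /http(s)?:\/\/(?:www\.)?(\w+)/;
// "s?" makes the letter optional without a group, so the provider stays in group 1.
var withoutCapture = /https?:\/\/(?:www\.)?(\w+)/;

var sample = "https://www.youtube.com/watch?v=abc";
var shifted = sample.match(withCapture);      // shifted[1] is "s", shifted[2] is "youtube"
var unshifted = sample.match(withoutCapture); // unshifted[1] is "youtube"
```

Writing `https?` (or `(?:s)?`) keeps the existing group indices intact, so the rest of the function does not need to change.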