OK, let's say I have a URL
example.com/hello/world/20111020 (with or without the trailing slash).
What I would like to do is strip the domain (example.com) from the URL and then break hello, world, and 20111020 into an array. My other problem is that sometimes the URL has no /hello/world/20111020 part, or just /hello/, so I first need to determine whether there is anything after example.com. If there is not, do nothing, since there is nothing to work with. If there is, I need to add each /-separated piece to the array in order, so that I can work with array[0] and know it was hello.
I tried something a couple of days back but kept running into issues with trailing slashes breaking the script, so I abandoned that idea. Today I am looking for fresh ideas.
This should work:
var url = 'example.com/hello/world/20111020/';
//get rid of the trailing / before doing a simple split on /
var url_parts = url.replace(/\/\s*$/,'').split('/');
//since we do not need example.com
url_parts.shift();
Now url_parts will point to the array ["hello", "world", "20111020"].
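For the cases the question mentions (no path at all, only /hello/, or a leading http://), a slightly more defensive variant might look like the sketch below. The function name getPathSegments is made up for illustration, and the sketch assumes the input always starts with a domain, optionally preceded by a scheme.

```javascript
// Hypothetical helper (not part of the answer above): returns the path
// segments of a URL string as an array, or [] when there is no path.
function getPathSegments(url) {
  return url
    .replace(/^[a-z][a-z0-9+.-]*:\/\//i, '') // drop a leading scheme such as http://
    .split(/[?#]/)[0]                         // ignore any query string or fragment
    .split('/')                               // break on every slash
    .slice(1)                                 // the first piece is the domain
    .filter(function (part) {                 // drop empty pieces left by stray slashes
      return part !== '';
    });
}

getPathSegments('example.com/hello/world/20111020/'); // ["hello", "world", "20111020"]
getPathSegments('example.com/hello/');                // ["hello"]
getPathSegments('example.com');                       // []
```

Because an empty array comes back when nothing follows the domain, the caller can check the length before touching index 0.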
You can use the jQuery-URL-Parser plugin:
var file = $.url.attr("file");
In your case you'd probably want to use segment():
var segments = $.url('http://allmarkedup.com/folder/dir/example/index.html').segment();
// segments = ['folder','dir','example','index.html']
<script type="text/javascript">
function splitThePath(incomingUrl){
var url = document.createElement("a");
url.href = incomingUrl;
//url.hash Returns the anchor portion of a URL
//url.host Returns the hostname and port of a URL
//url.hostname Returns the hostname of a URL
//url.href Returns the entire URL
//url.pathname Returns the path name of a URL
//url.port Returns the port number the server uses for a URL
//url.protocol Returns the protocol of a URL
//url.search Returns the query portion of a URL
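//Return the non-empty path segments, or an empty array when there is no path
if(url.pathname && url.pathname != "/"){
return url.pathname.split("/").filter(function(part){ return part != ""; });
}else{
return [];
}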
}
</script>
I have created the following regular expression for URLs
^https?://(((0|([1-9][0-9]{0,1}))(\.(0|([1-9][0-9]{0,1}))){3})|([a-zA-Z]([a-zA-Z0-9$\-_@\.&+!*"\'\(\),]|(%[0-9a-fA-F][0-9a-fA-F]))*(\.([a-zA-Z]([a-zA-Z0-9$\-_@\.&+!*"\'\(\),]|(%[0-9a-fA-F][0-9a-fA-F]))*))*))(/|((/([a-zA-Z]([a-zA-Z0-9$\-_@\.&+!*"\'\(\),]|(%[0-9a-fA-F][0-9a-fA-F]))*))*))$
It was written for MySQL, but I am sure with a bit of fiddling you can get it to work for your needs.
BTW, I took the idea from an RFC; the number escapes me at the moment.
Another approach to parsing URLs is to use an anchor DOM element:
var a = document.createElement("A");
a.href = 'http://example.com:8080/path/to/resources?param1=val1&params2=val2#named-anchor';
a.protocol; // http:
a.host; // example.com:8080
a.hostname; // example.com
a.port; // 8080 (an empty string is returned for the default port, e.g. 80 for http)
a.pathname; // /path/to/resources
a.hash; // #named-anchor
a.search; // ?param1=val1&params2=val2
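Where no DOM is available, the standard URL class exposes the same properties, so a similar idea can be sketched with it. This is only an illustration: segmentsFromUrl is a hypothetical name, and the sketch assumes an environment that provides a global URL constructor (current browsers and recent Node.js versions do). URL rejects input without a scheme, so one is prepended for bare strings such as example.com/hello.

```javascript
// Hypothetical helper: splits the pathname reported by the URL class
// into its non-empty segments.
function segmentsFromUrl(input) {
  // URL needs a scheme, so add one when the input is a bare "example.com/..." string.
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  var parsed = new URL(hasScheme ? input : 'http://' + input);

  // pathname is "/" when nothing follows the host, which filters down to [].
  return parsed.pathname.split('/').filter(function (part) {
    return part !== '';
  });
}

segmentsFromUrl('example.com/hello/world/20111020/'); // ["hello", "world", "20111020"]
segmentsFromUrl('http://example.com:8080/path/to/resources?param1=val1&params2=val2#named-anchor');
// ["path", "to", "resources"]
segmentsFromUrl('example.com'); // []
```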