I am using Python to extract the filename from a link using rfind like below:
url = "http://www.google.com/test.php"
print url[url.rfind("/") +1 : ]
This works ok for links without a / at the end and returns "test.php". However, I have encountered links with a / at the end, like "http://www.google.com/test.php/". I am having trouble getting the page name when there is a "/" at the end. Can anyone help?
Cheers
There is a library called urlparse that will parse the URL for you, but it still doesn't remove the / at the end, so one of the above will be the best option.
You could use
Filenames with a slash at the end are technically still path definitions and indicate that the index file is to be read. If you actually have one that ends in test.php/, I would consider that an error. In any case, you can strip the / from the end before running your code as follows:
Just for fun, you can use a Regexp:
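A minimal sketch of both suggestions, stripping the trailing slash first and the regular-expression variant. The function names filename_by_strip and filename_by_regex are made up for illustration, and the pattern is just one possible way to write it, not necessarily what the answers originally used:

```python
import re


def filename_by_strip(url):
    # Remove any trailing slashes, then keep everything after the last "/".
    trimmed = url.rstrip("/")
    return trimmed[trimmed.rfind("/") + 1:]


def filename_by_regex(url):
    # Capture the last run of non-slash characters, ignoring trailing slashes.
    match = re.search(r"([^/]+)/*$", url)
    return match.group(1) if match else ""


print(filename_by_strip("http://www.google.com/test.php/"))  # test.php
print(filename_by_regex("http://www.google.com/test.php/"))  # test.php
```

Both versions also work on the original URL without a trailing slash. Note that neither one looks at query strings, so a "/" inside the parameters would still confuse them.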
Just removing the slash at the end won't work, because you may well have a URL that looks like this:
...in which case you'll get back "hey.xml". Instead of manually checking for this, you can use urlparse to get rid of the parameters, then do the check other people suggested:
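A rough sketch of that approach. The URL below is a hypothetical one chosen to show a "/" inside the query string, and filename_from_url is an illustrative name. In Python 2 the function lives in the urlparse module; in Python 3 it is urllib.parse, so the import tries both:

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def filename_from_url(url):
    # urlparse splits off the query string and fragment, leaving only the path.
    path = urlparse(url).path
    # Drop trailing slashes, then keep everything after the last "/".
    trimmed = path.rstrip("/")
    return trimmed[trimmed.rfind("/") + 1:]


# Hypothetical URL with a slash inside its parameters.
print(filename_from_url("http://www.google.com/test.php?filepath=tests/hey.xml"))  # test.php
print(filename_from_url("http://www.google.com/test.php/"))  # test.php
```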
(But urlparse is probably more readable, even if more verbose.)