Python Find Question

Posted 2019-07-20 17:53

I am using Python to extract the filename from a link using rfind like below:

url = "http://www.google.com/test.php"

print(url[url.rfind("/") + 1:])

This works fine for links without a / at the end and returns "test.php". However, I have encountered links that end with a /, such as "http://www.google.com/test.php/". I am having trouble getting the page name when there is a "/" at the end. Can anyone help?

Cheers

Tags: python url
7 answers
三岁会撩人
#2 · 2019-07-20 18:18

There is a library called urlparse that will parse the URL for you, but it still doesn't remove the / at the end, so one of the other answers will be the best option.
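
To illustrate the trailing-slash point, here is a minimal sketch. It assumes Python 3, where the module is urllib.parse (in Python 2 it was urlparse); the variable name parts is only for illustration.

```python
from urllib.parse import urlparse  # Python 3 location of urlparse

parts = urlparse("http://www.google.com/test.php/")

# urlparse only splits the URL into components; the path keeps its trailing slash.
print(parts.path)  # /test.php/
```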

看我几分像从前
#3 · 2019-07-20 18:20

You could use

print(url.rstrip("/").rsplit("/", 1)[-1])
走好不送
#4 · 2019-07-20 18:25

Filenames with a slash at the end are technically still path definitions and indicate that the index file is to be read. If you actually have one that ends in test.php/, I would consider that an error. In any case, you can strip the / from the end before running your code as follows:

url = url.rstrip('/')
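
Put together with the rfind slice from the question, that could look roughly like the sketch below. It is written for Python 3, and last_segment is a hypothetical helper name, not something from the question.

```python
def last_segment(url):
    """Return the text after the last "/" once trailing slashes are removed."""
    url = url.rstrip("/")  # drops every trailing "/", not just one
    return url[url.rfind("/") + 1:]

print(last_segment("http://www.google.com/test.php"))   # test.php
print(last_segment("http://www.google.com/test.php/"))  # test.php
print(last_segment("http://www.google.com/"))           # www.google.com
```

Note the last case: when the URL has no page name at all, this approach hands back the host name, so it may need an extra check depending on your input.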
贪生不怕死
#5 · 2019-07-20 18:26

Just for fun, you can use a Regexp:

import re
print(re.search(r'/([^/]+)/?$', url).group(1))
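
One thing to watch: re.search returns None when nothing matches (for example, a string with no "/" in it), and calling .group(1) on None raises an AttributeError. A guarded variant might look like this sketch, written for Python 3; LAST_SEGMENT and page_name are hypothetical names.

```python
import re

# Same pattern as above: the last run of non-slash characters,
# optionally followed by a single trailing "/".
LAST_SEGMENT = re.compile(r"/([^/]+)/?$")

def page_name(url):
    match = LAST_SEGMENT.search(url)
    return match.group(1) if match else None

print(page_name("http://www.google.com/test.php/"))  # test.php
print(page_name("test.php"))                         # None
```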
Viruses.
#6 · 2019-07-20 18:28

Just removing the slash at the end won't work, because you may well have a URL that looks like this:

http://www.google.com/test.php?filepath=tests/hey.xml

...in which case you'll get back "hey.xml". Instead of manually checking for this, you can use urlparse to get rid of the parameters, then do the check other people suggested:

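try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2
url = "http://www.google.com/test.php?something=heyharr/sir/a.txt"
f = urlparse(url)[2].rstrip("/")  # index 2 is the path component
print(f[f.rfind("/") + 1:])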
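
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: it targets Python 3 (urllib.parse), filename_from_url is a made-up name, and it swaps the rfind slice for posixpath.basename, which does the same job on "/"-separated paths.

```python
import posixpath
from urllib.parse import urlparse

def filename_from_url(url):
    """Return the last path segment of a URL, ignoring the query string."""
    path = urlparse(url).path.rstrip("/")  # path only, trailing slashes removed
    return posixpath.basename(path)        # text after the last "/"

print(filename_from_url("http://www.google.com/test.php?something=heyharr/sir/a.txt"))  # test.php
print(filename_from_url("http://www.google.com/test.php/"))                             # test.php
```

For a URL with no page name, such as "http://www.google.com/", this returns an empty string rather than the host name.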
姐就是有狂的资本
#7 · 2019-07-20 18:30

[part for part in url.split('/') if part][-1]

(But urlparse is probably more readable, even if more verbose.)
