How to get everything after last slash in a URL?

Posted 2019-01-21 07:25

How can I extract whatever follows the last slash in a URL in Python? For example, these URLs should return the following:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

I've tried urlparse, but that gives me the full path filename, such as page/page/12345.

11 answers
\"骚年 ilove
#2 · 2019-01-21 08:02

partition and rpartition are also handy for such things:

url.rpartition('/')[2]
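
To make the indexing less mysterious, here is a small sketch of what rpartition returns. The sample URLs are only illustrations based on the ones in the question, and the trailing-slash case is an assumption about input you might run into:

```python
# Sample URL taken from the question, used only for illustration.
url = 'http://www.test.com/page/TEST2'

# rpartition splits at the LAST occurrence of the separator and
# always returns a 3-tuple: (before, separator, after).
head, sep, tail = url.rpartition('/')
print(head)  # http://www.test.com/page
print(tail)  # TEST2

# A trailing slash leaves the last element empty,
# so strip it first if your URLs may end with '/'.
trailing = 'http://www.test.com/page/TEST2/'
print(repr(trailing.rpartition('/')[2]))        # ''
print(trailing.rstrip('/').rpartition('/')[2])  # TEST2
```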
【Aperson】
#3 · 2019-01-21 08:04

urlparse is fine to use if you want to (say, to get rid of any query string parameters).

import urllib.parse

urls = [
    'http://www.test.com/TEST1',
    'http://www.test.com/page/TEST2',
    'http://www.test.com/page/page/12345',
    'http://www.test.com/page/page/12345?abc=123'
]

for i in urls:
    url_parts = urllib.parse.urlparse(i)
    path_parts = url_parts[2].rpartition('/')
    print('URL: {}\nreturns: {}\n'.format(i, path_parts[2]))

Output:

URL: http://www.test.com/TEST1
returns: TEST1

URL: http://www.test.com/page/TEST2
returns: TEST2

URL: http://www.test.com/page/page/12345
returns: 12345

URL: http://www.test.com/page/page/12345?abc=123
returns: 12345
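
If you need this in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: the name last_segment is made up for illustration, and it assumes you want to ignore a trailing slash as well as the query string and fragment:

```python
import posixpath
from urllib.parse import urlparse


def last_segment(url):
    """Return the final path segment of url, ignoring query and fragment."""
    # urlparse separates the path from the query string and fragment.
    path = urlparse(url).path
    # rstrip drops a trailing slash; basename returns the text after the last '/'.
    return posixpath.basename(path.rstrip('/'))


print(last_segment('http://www.test.com/page/page/12345?abc=123'))  # 12345
print(last_segment('http://www.test.com/page/TEST2/#top'))          # TEST2
```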
冷血范
#4 · 2019-01-21 08:11

You don't need anything fancy. Python's built-in string methods are enough to split your URL into the 'filename' part and the rest:

url.rsplit('/', 1)

So you can get the part you're interested in simply with:

url.rsplit('/', 1)[-1]
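
A quick sketch of what those two expressions give you, using a URL from the question. The query-string example is an extra case added here to show one limitation of pure string splitting:

```python
url = 'http://www.test.com/page/page/12345'

# maxsplit=1 means "split once, starting from the right",
# so the result has at most two elements.
parts = url.rsplit('/', 1)
print(parts)      # ['http://www.test.com/page/page', '12345']
print(parts[-1])  # 12345

# String splitting knows nothing about URL structure,
# so a query string stays attached to the last segment.
with_query = 'http://www.test.com/page/page/12345?abc=123'
print(with_query.rsplit('/', 1)[-1])  # 12345?abc=123
```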
走好不送
#5 · 2019-01-21 08:13

You can do it like this:

import os

head, tail = os.path.split(url)

Here tail is the part after the last slash.
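
One caveat: os.path is meant for filesystem paths and follows the conventions of the operating system it runs on. As a hedged alternative, posixpath (the module behind os.path on Unix-like systems) always treats '/' as the separator, so a sketch like this should behave the same everywhere:

```python
import posixpath

# Sample URL taken from the question, used only for illustration.
url = 'http://www.test.com/page/TEST2'

# posixpath.split cuts at the last '/', regardless of the platform.
head, tail = posixpath.split(url)
print(head)  # http://www.test.com/page
print(tail)  # TEST2
```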

三岁会撩人
#6 · 2019-01-21 08:13
url = 'http://www.test.com/page/TEST2'.split('/')[4]
print(url)

Output: TEST2.
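
Note that index 4 only matches this particular URL depth. A small sketch with the three URLs from the question shows why the last index, [-1], is the safer choice:

```python
urls = [
    'http://www.test.com/TEST1',
    'http://www.test.com/page/TEST2',
    'http://www.test.com/page/page/12345',
]

# Splitting on '/' yields the scheme, an empty string, the host,
# and then one element per path segment.
print(urls[1].split('/'))  # ['http:', '', 'www.test.com', 'page', 'TEST2']

# Index -1 always refers to the final element, whatever the depth.
last_parts = [u.split('/')[-1] for u in urls]
print(last_parts)  # ['TEST1', 'TEST2', '12345']

# A fixed index breaks for other depths: urls[0] has only four elements,
# and for urls[2] index 4 is an intermediate segment.
print(urls[2].split('/')[4])  # page
```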
