Remove Part of String Before the Last Forward Slash

The program I am currently working on retrieves URLs from a website and puts them into a list. What I want to get is the last section of the URL.

So, if the first element in my list of URLs is "https://docs.python.org/3.4/tutorial/interpreter.html" I would want to remove everything before "interpreter.html".

Is there a function, library, or regex I could use to make this happen? I've looked at other Stack Overflow posts but the solutions don't seem to work.

These are two of my several attempts:

for link in link_list:
   file_names.append(link.replace('/[^/]*$',''))
print(file_names)

for link in link_list:
   file_names.append(link.rpartition('//')[-1])
print(file_names)

Tags: python regex string replace

6 Answers

我只想做你的唯一

#2 · 2020-05-26 17:15

That doesn't need regex.

import os

for link in link_list:
    file_names.append(os.path.basename(link))
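
One caveat, offered as a hedged sketch: os.path.basename treats the whole string as a path, so a query string or fragment would end up in the result. Assuming the URLs may carry those (the example URL in the question does not), the path can be isolated first with urllib.parse.urlsplit. The helper name file_name_from_url and the second sample URL are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def file_name_from_url(url):
    """Return the last path segment of url, ignoring query and fragment.

    Hypothetical helper, not part of the original answer.
    """
    path = urlsplit(url).path  # e.g. '/3.4/tutorial/interpreter.html'
    return posixpath.basename(path)


# Sample data: the URL from the question, plus a variant with a
# query string and fragment (an assumption for illustration).
link_list = [
    "https://docs.python.org/3.4/tutorial/interpreter.html",
    "https://docs.python.org/3.4/tutorial/interpreter.html?highlight=argv#id1",
]
file_names = [file_name_from_url(link) for link in link_list]
print(file_names)  # ['interpreter.html', 'interpreter.html']
```

posixpath is used instead of os.path here so that the forward-slash rule is applied explicitly, whatever the operating system.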

走好不送

#3 · 2020-05-26 17:24

You can use rpartition():

>>> s = 'https://docs.python.org/3.4/tutorial/interpreter.html'
>>> s.rpartition('/')
('https://docs.python.org/3.4/tutorial', '/', 'interpreter.html')

And take the last part of the 3 element tuple that is returned:

>>> s.rpartition('/')[2]
'interpreter.html'
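
Applied to a whole list, as in the question, this might look like the sketch below. The contents of link_list are sample data, not the asker's real list. Note that when a string contains no '/' at all, rpartition returns ('', '', s), so index 2 still gives back the whole string.

```python
link_list = [
    "https://docs.python.org/3.4/tutorial/interpreter.html",
    "interpreter.html",  # no slash at all
]

# rpartition splits at the LAST occurrence of the separator.
file_names = [link.rpartition("/")[2] for link in link_list]
print(file_names)  # ['interpreter.html', 'interpreter.html']
```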

forever°为你锁心

#4 · 2020-05-26 17:24

Here's a more general, regex way of doing this:

    >>> import re
    >>> re.sub(r'^.+/([^/]+)$', r'\1', "http://test.org/3/files/interpreter.html")
    'interpreter.html'

Lonely孤独者°

#5 · 2020-05-26 17:24

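This should work if you plan to use regex. Note that str.replace only matches literal text, so the pattern has to go through re.sub:

import re

for link in link_list:
    file_names.append(re.sub(r'.*/', '', link))
print(file_names)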
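
For context, str.replace matches its first argument literally and does not interpret regular expressions, so a pattern only has an effect when it is passed to the re module. A small sketch of the difference, using the URL from the question:

```python
import re

link = "https://docs.python.org/3.4/tutorial/interpreter.html"

# str.replace looks for the literal text '.*/', which is not in the URL,
# so the string comes back unchanged.
unchanged = link.replace(".*/", "")

# re.sub treats '.*/' as a pattern: the greedy '.*' runs up to the last '/'.
stripped = re.sub(r".*/", "", link)

print(unchanged == link)  # True
print(stripped)           # interpreter.html
```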

再贱就再见

#6 · 2020-05-26 17:30

Just use str.split:

url = "/some/url/with/a/file.html"

print(url.split("/")[-1])

# Result should be "file.html"

split gives you a list of the strings that were separated by "/". The [-1] gives you the last element of that list, which is what you want.
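
One edge case worth a sketch: if a URL ends with "/", the last element returned by split is an empty string. Assuming you would rather get the last non-empty segment in that case, stripping trailing slashes first is one option. The sample URLs below are made up.

```python
urls = [
    "/some/url/with/a/file.html",
    "/some/url/with/a/directory/",
]

results = []
for url in urls:
    plain = url.split("/")[-1]                # '' when url ends with '/'
    trimmed = url.rstrip("/").split("/")[-1]  # last non-empty segment
    results.append((plain, trimmed))
    print(repr(plain), repr(trimmed))
```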

Viruses.

#7 · 2020-05-26 17:34

Have a look at str.rsplit.

>>> s = 'https://docs.python.org/3.4/tutorial/interpreter.html'
>>> s.rsplit('/',1)
['https://docs.python.org/3.4/tutorial', 'interpreter.html']
>>> s.rsplit('/',1)[1]
'interpreter.html'

And to use a regex:

>>> import re
>>> re.search(r'(.*)/(.*)',s).group(2)
'interpreter.html'

This takes the 2nd group, which lies between the last / and the end of the string. It works because the first .* is greedy: it consumes as much as possible, so only the text after the last / is left for the second group.

Small Note - The problem with link.rpartition('//')[-1] in your code is that you are trying to match // and not /. So remove the extra / as in link.rpartition('/')[-1].
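
To see what went wrong in the original attempt, here is a short sketch with the URL from the question. The only '//' in the string is the one right after 'https:', so partitioning on it keeps everything after the scheme.

```python
s = "https://docs.python.org/3.4/tutorial/interpreter.html"

# '//' occurs only once, right after the scheme.
with_double = s.rpartition("//")[-1]
print(with_double)  # docs.python.org/3.4/tutorial/interpreter.html

# A single '/' splits at the last slash, which is what the question asks for.
with_single = s.rpartition("/")[-1]
print(with_single)  # interpreter.html
```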
