Slicing URL with Python

Posted 2019-01-22 22:55

I am working with a huge list of URLs. Quick question: I am trying to slice part of a URL out, see below:

http://www.domainname.com/page?CONTENT_ITEM_ID=1234&param2&param3

How could I slice out:

http://www.domainname.com/page?CONTENT_ITEM_ID=1234

Sometimes there are more than two parameters after CONTENT_ITEM_ID, and the ID is different each time. I am thinking it can be done by finding the first & and keeping only the characters before it, but I'm not quite sure how to do this.

Cheers

10 answers
放荡不羁爱自由
Reply #2 · 2019-01-22 23:43

I figured it out; below is what I needed to do:

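url = "http://www.domainname.com/page?CONTENT_ITEM_ID=1234&param2&param3"
# str.find returns -1 when there is no "&"; slicing with -1 would drop
# the last character, so only slice when a "&" was actually found.
end = url.find("&")
if end != -1:
    url = url[:end]
print(url)
# http://www.domainname.com/page?CONTENT_ITEM_ID=1234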
Rolldiameter
Reply #3 · 2019-01-22 23:44

This method isn't dependent on the position of the parameter within the url string. This could be refined, I'm sure, but it gets the point across.

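url = 'http://www.domainname.com/page?CONTENT_ITEM_ID=1234&param2&param3'
parts = url.split('?')
# partition('=') always yields (key, '=', value), so valueless items such as
# 'param2' still produce a (key, value) pair and dict() does not raise.
params = dict(i.partition('=')[::2] for i in parts[1].split('&'))
new_url = parts[0] + '?CONTENT_ITEM_ID=' + params['CONTENT_ITEM_ID']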
萌系小妹纸
Reply #4 · 2019-01-22 23:45

Another option would be to use the split function with & as the separator. That way, you'd extract the base URL (including CONTENT_ITEM_ID) and each of the remaining parameters.

   url.split("&") 

returns a list with

  ['http://www.domainname.com/page?CONTENT_ITEM_ID=1234', 'param2', 'param3']
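
If only the first piece is needed, here is a minimal sketch of how that might look for a whole list. The `trim_after_first_amp` name and the `urls` list are made up for illustration. Passing a maxsplit of 1 cuts the string only once:

```python
def trim_after_first_amp(url):
    # Hypothetical helper: keep everything before the first "&".
    # If there is no "&", split returns the whole string as its only item.
    return url.split("&", 1)[0]

# Example data, made up for illustration
urls = [
    "http://www.domainname.com/page?CONTENT_ITEM_ID=1234&param2&param3",
    "http://www.domainname.com/page?CONTENT_ITEM_ID=5678",
]
trimmed = [trim_after_first_amp(u) for u in urls]
print(trimmed)
```

Note this still assumes CONTENT_ITEM_ID is the first parameter in the query string.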
再贱就再见
Reply #5 · 2019-01-22 23:55

Besides urlparse, there is also furl, which IMHO has a better API.
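
For the standard-library route, a rough sketch might look like the following, assuming Python 3, where the urlparse module lives in `urllib.parse`. The `keep_param` helper is a made-up name, not part of either library. It keeps only the named parameter, wherever it sits in the query string:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def keep_param(url, name="CONTENT_ITEM_ID"):
    # Hypothetical helper: rebuild the URL with only the named query parameter.
    parts = urlsplit(url)
    # keep_blank_values=True so valueless items like "param2" are parsed
    # as ("param2", "") instead of being silently skipped.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k == name]
    # SplitResult is a namedtuple, so _replace swaps in the new query string.
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(keep_param("http://www.domainname.com/page?param2&CONTENT_ITEM_ID=1234&param3"))
```

Because the query is parsed rather than sliced, this does not depend on CONTENT_ITEM_ID coming first.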
