Get the subdomain from a URL (answers, page 3)

Getting the subdomain from a URL sounds easy at first.

http://www.domain.example

Scan for the first period then return whatever came after the "http://" ...

Then you remember

http://super.duper.domain.example

Oh. So then you think, okay, find the last period, go back a word and get everything before!

Then you remember

http://super.duper.domain.co.uk

And you're back to square one. Anyone have any great ideas besides storing a list of all TLDs?

Tags: url parsing dns subdomain

15 answers

与君花间醉酒

#2 · 2019-01-01 07:33

I just wrote an Objective-C library for this: https://github.com/kejinlu/KKDomain

听够珍惜

#3 · 2019-01-01 07:39

Having taken a quick look at the publicsuffix.org list, it appears that you could make a reasonable approximation as follows (a "segment" here means a section between two dots). If the final segment is two characters long, remove the final three segments, on the assumption that it's a country code and will be further subdivided. If the final segment is "us" and the second-to-last segment is also two characters, remove the last four segments instead. In all other cases, remove the final two segments. For example:

http://www.domain.example

"example" is not two characters, so remove "domain.example", leaving "www"

http://super.duper.domain.example

"example" is not two characters, so remove "domain.example", leaving "super.duper"

http://super.duper.domain.co.uk

"uk" is two characters (but not "us"), so remove "domain.co.uk", leaving "super.duper"

http://foo.pvt.k12.wy.us

"us" is two characters and is "us", plus "wy" is also two characters, so remove "pvt.k12.wy.us", leaving "foo".

Note that, although this works for all examples that I've seen in the responses so far, it remains only a reasonable approximation. It is not completely correct, although I suspect it's about as close as you're likely to get without making/obtaining an actual list to use for reference.
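A minimal Python sketch of the rules described above, in case it helps to see them spelled out. The function name approx_subdomain is made up for illustration, and the sketch inherits every limitation of the heuristic itself.

```python
from urllib.parse import urlsplit


def approx_subdomain(url: str) -> str:
    """Guess the subdomain using only the segment-length heuristic.

    Hypothetical helper: an approximation, not a Public Suffix List lookup.
    """
    host = urlsplit(url).hostname or ""
    segments = host.split(".")
    last = segments[-1]

    if len(last) == 2:
        # Two-character final segment: assume a subdivided country code.
        if last == "us" and len(segments) >= 2 and len(segments[-2]) == 2:
            drop = 4  # e.g. "pvt.k12.wy.us"
        else:
            drop = 3  # e.g. "domain.co.uk"
    else:
        drop = 2      # e.g. "domain.example"

    # Slicing past the start simply yields an empty list, so short
    # hosts fall through to an empty string.
    return ".".join(segments[:-drop])


for u in ("http://www.domain.example",
          "http://super.duper.domain.example",
          "http://super.duper.domain.co.uk",
          "http://foo.pvt.k12.wy.us"):
    print(u, "->", approx_subdomain(u))
```

One case where it goes wrong: a two-letter country code that allows registrations directly under it (for instance a host like www.example.de) loses one segment too many and comes back empty.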

不流泪的眼

#4 · 2019-01-01 07:40

Keep a list of common suffixes (.co.uk, .com, et cetera) and strip the matching one out along with the http://. Then you'll only have "sub.domain" to work with instead of "http://sub.domain.suffix". At least that's what I'd probably do.

The biggest problem is the list of possible suffixes. There are a lot of them, after all.
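A rough Python sketch of that idea, assuming a tiny hand-written suffix set. KNOWN_SUFFIXES and split_host are names invented for this example; a real list (such as the one at publicsuffix.org) is far longer.

```python
from urllib.parse import urlsplit

# Illustrative only: a real suffix list has thousands of entries.
KNOWN_SUFFIXES = {"com", "example", "uk", "co.uk"}


def split_host(url: str):
    """Return (subdomain, domain, suffix), or None if no suffix matches."""
    host = (urlsplit(url).hostname or "").lower()
    segments = host.split(".")

    # Walk left to right so the longest matching suffix wins
    # ("co.uk" is tried before "uk").
    for i in range(1, len(segments)):
        suffix = ".".join(segments[i:])
        if suffix in KNOWN_SUFFIXES:
            rest = segments[:i]  # what is left after stripping the suffix
            return ".".join(rest[:-1]), rest[-1], suffix
    return None


print(split_host("http://super.duper.domain.co.uk"))
print(split_host("http://www.domain.example"))
```

Once the suffix is gone, the last remaining segment is taken as the domain and everything before it as the subdomain. Hosts whose suffix is missing from the set come back as None, which is exactly the "list is the biggest problem" issue.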
