Getting the subdomain from a URL sounds easy at first.
http://www.domain.example
Scan for the first period then return whatever came after the "http://" ...
Then you remember
http://super.duper.domain.example
Oh. So then you think, okay, find the last period, go back a word and get everything before!
Then you remember
http://super.duper.domain.co.uk
And you're back to square one. Anyone have any great ideas besides storing a list of all TLDs?
I just wrote an Objective-C library for this: https://github.com/kejinlu/KKDomain
Having taken a quick look at the publicsuffix.org list, it appears that you could make a reasonable approximation by removing the final three segments ("segment" here meaning a section between two dots) from domains where the final segment is two characters long, on the assumption that it's a country code and will be further subdivided. If the final segment is "us" and the second-to-last segment is also two characters, remove the last four segments. In all other cases, remove the final two segments. e.g.:
www.domain.example: "example" is not two characters, so remove "domain.example", leaving "www"
super.duper.domain.example: "example" is not two characters, so remove "domain.example", leaving "super.duper"
super.duper.domain.co.uk: "uk" is two characters (but not "us"), so remove "domain.co.uk", leaving "super.duper"
foo.pvt.k12.wy.us: "us" is two characters and is "us", plus "wy" is also two characters, so remove "pvt.k12.wy.us", leaving "foo"
Note that, although this works for all examples that I've seen in the responses so far, it remains only a reasonable approximation. It is not completely correct, although I suspect it's about as close as you're likely to get without making/obtaining an actual list to use for reference.
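For anyone who wants to try the segment-counting rule, here is a minimal Python sketch of it. The function name guess_subdomain is made up for illustration, and it uses the standard library's urllib.parse only to pull the host out of the URL. It inherits every limitation of the rule itself, so treat it as an approximation rather than a correct parser.

```python
from urllib.parse import urlsplit


def guess_subdomain(url: str) -> str:
    """Approximate the subdomain by counting trailing segments (heuristic only)."""
    host = urlsplit(url).hostname or ""
    segments = host.split(".")
    last = segments[-1]

    if len(last) == 2:
        # Two-letter final segment: assume a country code that is subdivided.
        if last == "us" and len(segments) >= 2 and len(segments[-2]) == 2:
            drop = 4  # e.g. "pvt.k12.wy.us"
        else:
            drop = 3  # e.g. "domain.co.uk"
    else:
        drop = 2      # e.g. "domain.example"

    # Slicing past the start of the list simply yields an empty subdomain.
    return ".".join(segments[:-drop])


print(guess_subdomain("http://super.duper.domain.co.uk"))
```

It will misfire on country-code domains that are registered directly under the two-letter suffix, which is exactly the kind of case the note above warns about.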
Keep a list of common suffixes (.co.uk, .com, et cetera) and strip the matching one out along with the http://. Then you'll only have "sub.domain" to work with instead of "http://sub.domain.suffix", or at least that's what I'd probably do.
The biggest problem is the list of possible suffixes. There are a lot of them, after all.
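A rough Python sketch of the suffix-list idea, in case it helps. KNOWN_SUFFIXES is a tiny hand-picked sample for illustration, not a real list, and subdomain_from_suffix_list is a made-up name. A real version would load the full list from publicsuffix.org, which also contains wildcard and exception rules that this sketch ignores.

```python
from urllib.parse import urlsplit

# Hypothetical sample only; a real list has thousands of entries.
KNOWN_SUFFIXES = {"com", "example", "uk", "co.uk"}


def subdomain_from_suffix_list(url, suffixes=KNOWN_SUFFIXES):
    """Return the subdomain, or None if no known suffix matches the host."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")

    # Walk from the longest candidate suffix to the shortest, so that
    # "co.uk" is matched before the bare "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            # labels[i - 1] is the registered name; everything before it
            # is the subdomain.
            return ".".join(labels[:max(i - 1, 0)])
    return None


print(subdomain_from_suffix_list("http://super.duper.domain.co.uk"))
```

Checking the longest candidate first matters: if "uk" were tried before "co.uk", the function would treat "co" as the registered name and return "super.duper.domain".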