可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I need to trim URL's in Microsoft Excel to the root domain and to the subdomain.
A1 = Contains https://blog.example.com/page/
B1 = Should result in example.com
C1 = Should result in blog.example.com
Two formulas removing http, https, .www and PATH. The first version (B1) should also remove SUBDOMAIN.
I only have one formula right now:
=MID(SUBSTITUTE(A2;"www.";"");SEARCH(":";A2)+3;SEARCH("/";SUBSTITUTE(A2;"www.";"");9)-SEARCH(":";A2)-3)
https://example.com/page/page
results in example.com
http://www.example.com/page/page
results in example.com
http://blog.example.com/page/
results in blog.example.com
example.com/page
results in #VALUE!
www.example.com/page
results in #VALUE!
As you can see in the sample above I get good results. But it dosen't work without http or https. Also this version keeps subdomain.
回答1:
Try this in B1,
=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(REPLACE(REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/", FIND("/", REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/"), LEN(A1), TEXT(,)), CHAR(46), REPT(CHAR(32), LEN(A1))), LEN(A1)*2)), CHAR(32), CHAR(46))
.... and this in C1,
=SUBSTITUTE(REPLACE(REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/", FIND("/", REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/"), LEN(A1), TEXT(,)), "www.", TEXT(,))
回答2:
Subdomain - it's Jeeped's answer, but I've added support for blank cells, because the original version outputted "/":
=IF(ISBLANK(A1), "", SUBSTITUTE(REPLACE(REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/", FIND("/", REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/"), LEN(A1), TEXT(,)), "www.", TEXT(,)))
Domain - a version from MrExcel that supports international domains (e.g. this.co.uk). But unlike Jeeped's version it doesn't support 1 word TLDs like www.this.co or test.this.co - does anyone know how to fix this? For now I use a helper row at least for "www":
=IF(LEFT(a1,LEN("www."))="www.",RIGHT(a1,LEN(a1)-LEN("www.")), a1)
=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(TRIM(TRIM(LEFT(SUBSTITUTE(TRIM(IFERROR(MID(b1,FIND("://",b1)+3,99),b1))&"/","/",REPT(" ",99)),99))),".",REPT(" ",99)),99*(2+(LEN(TRIM(RIGHT(SUBSTITUTE(TRIM(TRIM(LEFT(SUBSTITUTE(TRIM(IFERROR(MID(b1,FIND("://",b1)+3,99),b1))&"/","/",REPT(" ",99)),99)))&".",".",REPT(" ",99)),198)))=2))))," ",".")
It worked on:
A | B | C
(blank) | "" | ""
blog.test.com | blog.test.com | test.com
http://blog.test.com | blog.test.com | test.com
test.com | test.com | test.com
http://test.com | test.com | test.com
https://test.com | test.com | test.com
www.test.com | test.com | test.com
http://www.test.com | test.com | test.com
https://www.test.com | test.com | test.com
test.co.uk | test.co.uk | test.co.uk
http://test.co.uk | test.co.uk | test.co.uk
https://test.co.uk | test.co.uk | test.co.uk
www.test.co.uk | test.co.uk | test.co.uk
http://www.test.co.uk | test.co.uk | test.co.uk
https://www.test.co.uk | test.co.uk | test.co.uk
example.test.co.uk | example.test.co.uk | test.co.uk
http://example.test.co.uk | example.test.co.uk | test.co.uk
https://example.test.co.uk | example.test.co.uk | test.co.uk
example.com/test | example.com | example.com
http://example.com/test | example.com | example.com
https://example.com/test | example.com | example.com
http://blog.example.com/page/ | blog.example.com | example.com
example.com/page | example.com | example.com
www.example.com/page | example.com | example.com
回答3:
If your version of excel has FILTERXML function (which can be found in Excel 365, Excel 2019, Excel 2016, and Excel 2013
),
Suppose your URLs are in range A2:A29
To find Sub Domain, enter the following formula in Cell B2
and drag it down:
=SUBSTITUTE(FILTERXML("<t><s>"&SUBSTITUTE(IFERROR(MID(A2,FIND("//",A2)+2,LEN(A2)),A2),"/","</s><s>")&"</s></t>","t/s[1]"),"www.","")
For the logic behind this formula you may give a read to this article: Extract Words with FILTERXML.
To find Root Domain, enter the following formula in Cell C2
and drag it down:
=IF((SUMPRODUCT(--(MID(B2,ROW($1:$100),1)="."))-IF(SUMPRODUCT(--(MID(RIGHT(B2,8),ROW($1:$8),1)="."))=3,2,SUMPRODUCT(--(MID(RIGHT(B2,8),ROW($1:$8),1)="."))))>0,RIGHT(B2,LEN(B2)-FIND(".",B2)),B2)
I used the Sub Domain from the first formula to find Root Domain. The trick is to find out if the component of the URL before the first dot .
is the root domain or sub domain, and take action accordingly.
Sample Data
| URL | Sub | Root |
|----------------------------------|---------------------|----------------|
| https://example.com/page/page | example.com | example.com |
| http://www.example.com/page/page | example.com | example.com |
| http://blog.example.com/page/ | blog.example.com | example.com |
| example.com/page | example.com | example.com |
| www.example.com/page | example.com | example.com |
| blog.test.com | blog.test.com | test.com |
| http://blog.test.com | blog.test.com | test.com |
| test.com | test.com | test.com |
| http://blog.test.uk.net/ | blog.test.uk.net | test.uk.net |
| https://test.cn | test.cn | test.cn |
| www.test.com | test.com | test.com |
| http://www.test.com | test.com | test.com |
| https://www.test.com | test.com | test.com |
| test.co.uk | test.co.uk | test.co.uk |
| https://test.co.uk | test.co.uk | test.co.uk |
| www.test.co.uk | test.co.uk | test.co.uk |
| http://www.test.co.uk | test.co.uk | test.co.uk |
| https://www.test.co.uk | test.co.uk | test.co.uk |
| blog.123.firm.in | blog.123.firm.in | 123.firm.in |
| http://example.test.co.uk | example.test.co.uk | test.co.uk |
| https://test.7.org.au | test.7.org.au | 7.org.au |
| test.example.org.nz/page | test.example.org.nz | example.org.nz |
| http://example.com/test | example.com | example.com |
| https://example.com/test | example.com | example.com |
| http://blog.example.com/page/ | blog.example.com | example.com |
| example.com/page | example.com | example.com |
| www.example.com/page | example.com | example.com |
| http://blog.1.co.uk | blog.1.co.uk | 1.co.uk |