Extract domain names from a file in Shell [closed]

Posted 2019-01-29 13:29

Question:

I have a file whose content is a list of URLs,
and I want to extract the domain names from this list in bash.
Example:

sub1.domain.com
domain3.com
sub5.domain.ext
subof.subdomain.domainx.ex2

I want to extract just the domain names from this list.
How can I do this?
Thank you

Answer 1:

You can use grep:

grep -Eo '[^.]+\.[^.]+$' file.txt

Example:

$ cat file.txt
sub1.domain.com
sub2.domains2.com
domain3.com
sub5.domain.ext
subof.subdomain.domainx.ex2

$ grep -Eo '[^.]+\.[^.]+$' file.txt
domain.com
domains2.com
domain3.com
domain.ext
domainx.ex2

Note that this will return co.uk for www.google.co.uk.
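For example:

$ echo 'www.google.co.uk' | grep -Eo '[^.]+\.[^.]+$'
co.uk

Getting google.co.uk instead requires knowledge of multi-label public suffixes, which is what the next answer uses.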



Answer 2:

A possible solution using Perl:

use strict;
use warnings;
use feature qw( say );

use Domain::PublicSuffix;

# Uses the Public Suffix List, so multi-label suffixes
# such as co.uk are handled correctly.
my $dps = Domain::PublicSuffix->new();

for my $host (qw(
   www.google.com
   foo.bar.google.com
   www.google.co.uk
   foo.bar.google.co.uk
)) {
   my $root = $dps->get_root_domain($host)
      or die $dps->error();

   say $root;
}

Output:

google.com
google.com
google.co.uk
google.co.uk
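
To run this against the question's file from the shell, the same module can be driven by a perl one-liner. A minimal sketch, assuming Domain::PublicSuffix is installed and file.txt contains one hostname per line:

$ perl -MDomain::PublicSuffix -E '
   my $dps = Domain::PublicSuffix->new();
   while (my $host = <>) {
      chomp $host;
      my $root = $dps->get_root_domain($host)
         or die $dps->error();
      say $root;
   }
' file.txt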