I've seen some similar posts, but none covers exactly what I want to do.
How can I extract URLs from plain text, save them, and then remove them from the text?
Example:
"Hello!!, I love http://www.google.es".
I want to extract "http://www.google.es", save it in a variable, and then remove it from my text.
The final text should look like this:
"Hello!!, I love".
The URL is usually the last "word" of the text, but not always.
Perhaps you want URI::Find, which can find URIs in arbitrary text. The return value from the code reference you give it produces the replacement string for the URL, so you can just return the empty string if you merely want to get rid of the URIs:
use URI::Find;
my $string = do { local $/; <DATA> };
my $finder = URI::Find->new( sub { '' } );
$finder->find(\$string );
print $string;
__END__
This has a mailto:joe@example.com
Go to http://www.google.com
Pay at https://paypal.com
From ftp://ftp.cpan.org download a file
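Since the question also asks to save each URL before removing it, the same callback can collect what it finds. This is only a sketch: `extract_uris` is a hypothetical helper name, and it relies on the documented URI::Find behavior of passing the callback a URI object followed by the original matched text.

```perl
use strict;
use warnings;
use URI::Find;

# Hypothetical helper: returns the cleaned text followed by the URIs found.
sub extract_uris {
    my ($text) = @_;
    my @found;

    my $finder = URI::Find->new(sub {
        my ($uri, $orig_uri) = @_;   # URI object, text exactly as it appeared
        push @found, $orig_uri;      # save the URI
        return '';                   # replace it with nothing
    });
    $finder->find(\$text);

    $text =~ s/[ \t]+$//mg;          # drop trailing blanks left behind
    return ($text, @found);
}

my ($clean, @uris) = extract_uris('Hello!!, I love http://www.google.es');
print "$clean\n";
print "saved: $_\n" for @uris;
```

Only trailing whitespace is tidied here; a URI removed from the middle of a line still leaves the spaces on both sides of it.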
This works for me in about 99% of cases. There are certainly edge cases, but it's good enough for my needs:
/((?<=[^a-zA-Z0-9])(?:https?\:\/\/|[a-zA-Z0-9]{1,}\.{1}|\b)(?:\w{1,}\.{1}){1,5}(?:com|org|edu|gov|uk|net|ca|de|jp|fr|au|us|ru|ch|it|nl|se|no|es|mil|iq|io|ac|ly|sm){1}(?:\/[a-zA-Z0-9]{1,})*)/mg
https://regex101.com/r/fO6mX3/2
If Perl is not a must, awk can do it:
$ cat file
"Hello!!, I love http://www.google.es".
this is another link http://www.somewhere.com
this if ftp link ftp://www.anywhere.com the end
$ awk '{gsub(/(https?|ftp):\/\/.[^" ]*/,"") }1' file
"Hello!!, I love ".
this is another link
this if ftp link the end
Of course, you can also adapt the regex to Perl if you like.
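One possible Perl adaptation, using only core features, might look like the following. It is a rough heuristic rather than a full URL parser: `strip_urls` is a hypothetical name, the pattern is widened to accept `https` as well, and it also swallows blanks immediately before each URL so the remaining text closes up.

```perl
use strict;
use warnings;

# Heuristic pattern: scheme, "://", then everything up to whitespace or a quote.
my $url_re = qr{(?:https?|ftp)://[^\s"]+};

# Hypothetical helper: returns the cleaned text followed by the URLs removed.
sub strip_urls {
    my ($text) = @_;
    my @urls;
    # /e runs the replacement as code, so each match is saved before deletion.
    $text =~ s{[ \t]*($url_re)}{ push @urls, $1; '' }ge;
    return ($text, @urls);
}

my $input = qq{"Hello!!, I love http://www.google.es".\n}
          . qq{this if ftp link ftp://www.anywhere.com the end\n};

my ($clean, @urls) = strip_urls($input);
print $clean;
print "saved: $_\n" for @urls;
```

A URL at the very start of a line still leaves the space that followed it, and trailing punctuation such as a sentence-ending period would be captured as part of the URL, so the pattern may need tightening for real input.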