JavaScript regex to find a base URL

Posted 2020-04-19 05:20

I'm going mad with this regex in JS:

var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;

If I give it an input string like "http://www.eitb.com/servicios/concursos/516522/", this regex is supposed to return NULL, because there is a "folder" after the base URL. It works in PHP, but not in JavaScript, as in this script:

<script type="text/javascript">
var str="http://www.eitb.com/servicios/concursos/516522/"; 
var patt1=/^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
document.write(str.match(patt1));
</script>

It returns

http://www.eitb.com/servicios/concursos/516522/,,/516522,,/ 

The question is: why is it not working, and how can I make it work?

The idea is to implement this regex in another function to get NULL when the URL passed is not in the correct format:

http://www.eitb.com/ -> Correct
http://www.eitb.com/something -> Incorrect

Thanks

2 Answers
看我几分像从前
#2 · 2020-04-19 05:56

Assuming you have a properly formatted URL, this simple RegExp should do the trick every time.

var patt1=/^https?:\/\/[^\/]+/i;

Here's the breakdown...

Starting with the first position (denoted by ^)

Look for http

http can be followed by s (denoted by the ? which means 0 or 1 of the character or set before it)

Then look for :// after the http or https (denoted by :\/\/)

Next match any number of characters except for / (denoted by [^\/]+ - the + means 1 or more)

Case insensitive (denoted by i)

NOTE: this will also pick up ports, e.g. http://example.com:80. To drop the :80 (or a colon followed by any port number), add a : to the negated character class, i.e. [^\/:].
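As a rough sketch of how the pattern above might be used, the snippet below wraps it in a hypothetical helper called `baseUrl` (the function name and the `keepPort` parameter are assumptions for illustration, not part of the answer). Because the pattern has no `$` anchor, it extracts the base of a URL rather than rejecting URLs that have a path, so it only returns null when the string does not start with http:// or https://.

```javascript
// Hypothetical helper (name and keepPort flag are illustrative assumptions).
// Returns the scheme + host part of a URL, or null if the string does not
// start with http:// or https://.
function baseUrl(url, keepPort = true) {
  // Pattern from the answer: everything up to the first "/" after the scheme.
  const withPort = /^https?:\/\/[^\/]+/i;
  // Variant from the NOTE: also stop at ":" so the port is left out.
  const withoutPort = /^https?:\/\/[^\/:]+/i;
  const m = url.match(keepPort ? withPort : withoutPort);
  return m ? m[0] : null;
}

console.log(baseUrl("http://www.eitb.com/servicios/concursos/516522/")); // "http://www.eitb.com"
console.log(baseUrl("http://example.com:80/path"));                      // "http://example.com:80"
console.log(baseUrl("http://example.com:80/path", false));               // "http://example.com"
console.log(baseUrl("ftp://example.com/"));                              // null
```

To get the behaviour the question asks for (null whenever a path follows the host), the caller could compare the extracted base against the input, or use an anchored pattern instead.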

老娘就宠你
#3 · 2020-04-19 06:04

I'm no JavaScript pro, but I'm accustomed to Perl regexps, so I'll give it a try: the . in the middle of the regexp might need to be escaped, since an unescaped . matches any character, including /, and that jinxes the whole regexp.

Try this way:

var patt1=/^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i; 
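A quick way to check this is to run both patterns against the URLs from the question. The sketch below assumes a hypothetical wrapper named `isBaseUrl` (not part of the original posts). With the unescaped dot, the group `(.[a-z0-9-]+)` can consume segments such as "/servicios", which is why "/516522" shows up in the reported output; with `\.` the same input yields null.

```javascript
// Original pattern from the question: the "." is unescaped and matches "/".
const unescaped = /^http(s)?:\/\/[a-z0-9-]+(.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;
// Corrected pattern from this answer: "\." only matches a literal dot.
const patt1 = /^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*?(:[0-9]+)?(\/)?$/i;

// Hypothetical wrapper: true only when the URL has no path after the host.
function isBaseUrl(url) {
  return patt1.test(url);
}

const str = "http://www.eitb.com/servicios/concursos/516522/";

console.log(str.match(unescaped) !== null);                 // true  (the bug)
console.log(str.match(patt1));                              // null  (the fix)
console.log(isBaseUrl("http://www.eitb.com/"));             // true
console.log(isBaseUrl("http://www.eitb.com/something"));    // false
```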