PHP Url Validation Error: http://https://example.c

Posted 2019-07-10 01:30

I had this url regex pattern in place:

$pattern = "@\b(https?://[^\s()<>\[\]\{\}]{1,".$max_length_allowed_for_each_url."}(?:\([\w\d]+\)|([^[:punct:]\s]|/)))@";

It seemed to work pretty well at validating any URL I threw at it, until I realized that it accepted https://http://google.com as a valid URL, when it certainly is not. (Apparently even Stack Overflow considers it valid: it made that URL clickable, not me, although it did remove one of the colons. So perhaps I am out of luck?)

I did a little research and learned that I should be using filter_var instead of a regex for PHP URL validation anyway, and was disappointed to find that it too is susceptible to this very same validation problem.

I could easily conquer it with:

$url = str_replace(array("https://http://","http://https://"), array("http://","https://"), $url);

But... that just seems so wrong.

1 answer
SAY GOODBYE
#2 · 2019-07-10 02:28

Well, it is a valid URI, technically. Look at the RFC for URIs (RFC 3986) if you don't believe me.

  • The path component of a URI can contain //.
  • http is a valid host name.
  • The port is allowed to be missing even if the : is present (the grammar specifies it as *DIGIT, not 1*DIGIT). This is why Stack Overflow removed the colon: it thought you were using the default port, so it removed it from the URI.
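
To see how those three points combine, here is a minimal sketch that splits the URI using the generic regular expression from RFC 3986, Appendix B. The helper name split_uri is hypothetical and not part of PHP; it only illustrates the decomposition and is not a validator.

```php
<?php
// split_uri() is a hypothetical helper, not a PHP built-in.
// It applies the component-splitting regex from RFC 3986, Appendix B.
function split_uri(string $uri): array
{
    $re = '@^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?@';
    preg_match($re, $uri, $m);

    return [
        'scheme'    => $m[2] ?? '',  // text before the first ":"
        'authority' => $m[4] ?? '',  // host[:port], between "//" and the next "/"
        'path'      => $m[5] ?? '',  // everything up to "?" or "#"
    ];
}

$parts = split_uri('https://http://google.com');
// scheme    => "https"
// authority => "http:"         (host "http", followed by an empty port)
// path      => "//google.com"  (a path that happens to contain "//")
```

Read this way, the second "http" is just a host name with an empty port, and "//google.com" is the path.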

I suggest writing a special case for this. In a separate step, check to see if the URI starts with https?://https?://, and fix it.
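
A minimal sketch of that separate step might look like the following. The function names are hypothetical, and keeping the inner (second) scheme is an assumption borrowed from the str_replace in the question; you may prefer to reject such input instead, which the second helper shows.

```php
<?php
// Hypothetical helper: collapse a doubled scheme at the start of the URL.
// Assumption: the inner (second) scheme is the one the user meant,
// matching the str_replace() shown in the question.
function collapse_doubled_scheme(string $url): string
{
    // Anchored with ^ so only a leading "http(s)://http(s)://" is touched;
    // $1 is the inner scheme. Fall back to the input if preg_replace fails.
    return preg_replace('@^https?://(https?://)@i', '$1', $url) ?? $url;
}

// Hypothetical helper: reject the doubled scheme outright,
// then defer everything else to filter_var().
function is_acceptable_url(string $url): bool
{
    if (preg_match('@^https?://https?://@i', $url) === 1) {
        return false;
    }
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}
```

Unlike the plain str_replace, the anchored pattern leaves a URL alone when "http://" merely appears later on, for example inside a query string.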
