Regex to match any URLs or links in PHP

Posted 2019-02-21 08:35

Question:

I have this regex to catch any URL in PHP:

((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)

But it doesn't catch the full URL. For example, here:

http://v12.lscache6.c.youtube.com/videoplayback?app=youtube_gdata&devkey=AX8iKz393pCCMUL6wqrPOZoO88HsQjpE1a8d1GxQnGDm&el=videos&upn=0K3DA3wYhjI&uaopt=no-save&source=youtube&itag=18&id=ab59b1e9554eca6d&ip=0.0.0.0&ipbits=0&expire=1339342342&sparams=id,itag,source,uaopt,upn,ip,ipbits,expire&signature=5026BE137B41D5CD9785E752D1892903D432974C.BA1D4E0C138210B2275391A2A3D469E582183621&key=yta1&cms_redirect=yes

it only caught:

http://v12.lscache6.c.youtube.com/videoplayback?app=youtube_gdata&devkey=AX8iKz393pCCMUL6wqrPOZoO88HsQjpE1a8d1GxQnGDm&el=videos&upn=0K3DA3wYhjI&uaopt=no-save&source=youtube&itag=18&id=ab59b1e9554eca6d&ip=0.0.0.0&ipbits=0&expire=1339342342&sparams=id

So what do I need to change to catch the full URL?

Answer 1:

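You need to add a comma to the character class that matches the query string. The sparams value in your URL contains commas, so the match stops at the first one.

Your regex, fixed: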

((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@,.\w_]*)#?(?:[\w]*)?))
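
To see the effect of the comma in isolation, here is a minimal sketch that uses only a query-string character class modeled on the one above, not the full pattern. The example.com URL and the variable names are made up for illustration.

```php
<?php
// Hypothetical, shortened URL; only the query-string part matters here.
$url = 'http://example.com/videoplayback?ip=0.0.0.0&sparams=id,itag,source&key=yta1';

// Query-string character classes modeled on the question (no comma)
// and on the fixed regex (comma added).
$withoutComma = '/\?[-+=&;%@.\w]*/';
$withComma    = '/\?[-+=&;%@,.\w]*/';

preg_match($withoutComma, $url, $before);
preg_match($withComma, $url, $after);

echo $before[0], PHP_EOL; // ?ip=0.0.0.0&sparams=id
echo $after[0], PHP_EOL;  // ?ip=0.0.0.0&sparams=id,itag,source&key=yta1
```

Without the comma in the class, the match ends right before the first comma, which is the truncation described in the question.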

A good site for testing regexes: Rubular

If you want to decompose the URL into its parts, you can use PHP's parse_url() function.
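
A short sketch of that approach, assuming you already have the URL as a string. The URL below is a shortened, made-up one; parse_url() and parse_str() are standard PHP functions.

```php
<?php
// Hypothetical URL for illustration.
$url = 'http://example.com/videoplayback?app=youtube_gdata&sparams=id,itag';

// parse_url() splits the URL into scheme, host, path, query, and so on.
$parts = parse_url($url);

// parse_str() turns the query string into an associative array.
parse_str($parts['query'], $params);

echo $parts['host'], PHP_EOL;     // example.com
echo $parts['path'], PHP_EOL;     // /videoplayback
echo $params['sparams'], PHP_EOL; // id,itag
```

Note that parse_url() only splits a string that is already a URL; it does not find URLs inside larger text, so it complements the regex rather than replacing it.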



Answer 2:

You could try this one:

\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]
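
A minimal sketch of using this pattern from PHP with preg_match_all(). The character classes only list A-Z, so the i (case-insensitive) flag is assumed. The braces are used as delimiters because the pattern itself contains /, # and ~. The sample text and variable names are made up for illustration.

```php
<?php
// The pattern from this answer, wrapped in {} delimiters with the i flag.
$pattern = '{\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]}i';

// Hypothetical input text containing two links.
$text = 'See http://example.com/watch?v=1&list=a,b. Also www.example.org/path!';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches:
//   http://example.com/watch?v=1&list=a,b
//   www.example.org/path
print_r($matches[0]);
```

The final character class leaves out punctuation such as . , ! and ?, so a trailing full stop or exclamation mark after a link is not captured as part of the URL.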

Explanation

<!--
\b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]

Options: case insensitive

Assert position at a word boundary «\b»
Match the regular expression below «(?:(?:https?|ftp|file)://|www\.|ftp\.)»
   Match either the regular expression below (attempting the next alternative only if this one fails) «(?:https?|ftp|file)://»
      Match the regular expression below «(?:https?|ftp|file)»
         Match either the regular expression below (attempting the next alternative only if this one fails) «https?»
            Match the characters “http” literally «http»
            Match the character “s” literally «s?»
               Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
         Or match regular expression number 2 below (attempting the next alternative only if this one fails) «ftp»
            Match the characters “ftp” literally «ftp»
         Or match regular expression number 3 below (the entire group fails if this one fails to match) «file»
            Match the characters “file” literally «file»
      Match the characters “://” literally «://»
   Or match regular expression number 2 below (attempting the next alternative only if this one fails) «www\.»
      Match the characters “www” literally «www»
      Match the character “.” literally «\.»
   Or match regular expression number 3 below (the entire group fails if this one fails to match) «ftp\.»
      Match the characters “ftp” literally «ftp»
      Match the character “.” literally «\.»
Match a single character present in the list below «[-A-Z0-9+&@#/%=~_|$?!:,.]*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   The character “-” «-»
   A character in the range between “A” and “Z” «A-Z»
   A character in the range between “0” and “9” «0-9»
   One of the characters “+&@#/%=~_|$?!:,.” «+&@#/%=~_|$?!:,.»
Match a single character present in the list below «[A-Z0-9+&@#/%=~_|$]»
   A character in the range between “A” and “Z” «A-Z»
   A character in the range between “0” and “9” «0-9»
   One of the characters “+&@#/%=~_|$” «+&@#/%=~_|$»
-->