A comprehensive regex for phone number validation

Posted 2020-01-22 10:31

I'm trying to put together a comprehensive regex to validate phone numbers. Ideally it would handle international formats, but it must handle US formats, including the following:

  • 1-234-567-8901
  • 1-234-567-8901 x1234
  • 1-234-567-8901 ext1234
  • 1 (234) 567-8901
  • 1.234.567.8901
  • 1/234/567/8901
  • 12345678901

I'll answer with my current attempt, but I'm hoping somebody has something better and/or more elegant.

30 answers
Root(大扎)
Reply #2 · 2020-01-22 11:05
.*

If the users want to give you their phone numbers, then trust them to get it right. If they do not want to give it to you then forcing them to enter a valid number will either send them to a competitor's site or make them enter a random string that fits your regex. I might even be tempted to look up the number of a premium rate horoscope hotline and enter that instead.

I would also consider any of the following as valid entries on a web site:

"123 456 7890 until 6pm, then 098 765 4321"  
"123 456 7890 or try my mobile on 098 765 4321"  
"ex-directory - mind your own business"
\"骚年 ilove
Reply #3 · 2020-01-22 11:05

I would also suggest looking at the "libphonenumber" Google Library. I know it is not regex but it does exactly what you want.

For example, it will recognize that:

15555555555

is a possible number but not a valid number. It also supports countries outside the US.

Highlights of functionality:

  • Parsing/formatting/validating phone numbers for all countries/regions of the world.
  • getNumberType - gets the type of the number based on the number itself; able to distinguish Fixed-line, Mobile, Toll-free, Premium Rate, Shared Cost, VoIP and Personal Numbers (whenever feasible).
  • isNumberMatch - gets a confidence level on whether two numbers could be the same.
  • getExampleNumber/getExampleNumberByType - provides valid example numbers for all countries/regions, with the option of specifying which type of example phone number is needed.
  • isPossibleNumber - quickly guesses whether a number is a possible phone number by using only the length information, much faster than a full validation.
  • isValidNumber - full validation of a phone number for a region using length and prefix information.
  • AsYouTypeFormatter - formats phone numbers on-the-fly when users enter each digit.
  • findNumbers - finds numbers in text input.
  • PhoneNumberOfflineGeocoder - provides geographical information related to a phone number.

Examples

The biggest problem with phone number validation is that it is very culturally dependent.

  • America
    • (408) 974–2042 is a valid US number
    • (999) 974–2042 is not a valid US number
  • Australia
    • 0404 999 999 is a valid Australian number
    • (02) 9999 9999 is also a valid Australian number
    • (09) 9999 9999 is not a valid Australian number

A regular expression is fine for checking the format of a phone number, but it's not really going to be able to check the validity of a phone number.

I would suggest skipping a simple regular expression for testing phone numbers and using a library such as Google's libphonenumber (link to GitHub project) instead.

Introducing libphonenumber!

Using one of your more complex examples, 1-234-567-8901 x1234, you get the following data out of libphonenumber (link to online demo):

Validation Results

Result from isPossibleNumber()  true
Result from isValidNumber()     true

Formatting Results:

E164 format                    +12345678901
Original format                (234) 567-8901 ext. 123
National format                (234) 567-8901 ext. 123
International format           +1 234-567-8901 ext. 123
Out-of-country format from US  1 (234) 567-8901 ext. 123
Out-of-country format from CH  00 1 234-567-8901 ext. 123

So not only do you learn if the phone number is valid (which it is), but you also get consistent phone number formatting in your locale.

As a bonus, libphonenumber has a number of datasets to check the validity of phone numbers, as well, so checking a number such as +61299999999 (the international version of (02) 9999 9999) returns as a valid number with formatting:

Validation Results

Result from isPossibleNumber()  true
Result from isValidNumber()     true

Formatting Results

E164 format                    +61299999999
Original format                61 2 9999 9999
National format                (02) 9999 9999
International format           +61 2 9999 9999
Out-of-country format from US  011 61 2 9999 9999
Out-of-country format from CH  00 61 2 9999 9999

libphonenumber also gives you many additional benefits, such as looking up the location a phone number belongs to and the time zone information associated with it:

PhoneNumberOfflineGeocoder Results
Location        Australia

PhoneNumberToTimeZonesMapper Results
Time zone(s)    [Australia/Sydney]

The invalid Australian phone number (09) 9999 9999, by contrast, is reported as not valid:

Validation Results

Result from isPossibleNumber()  true
Result from isValidNumber()     false

Google's version has code for Java and JavaScript, but people have also implemented libraries for other languages that use the Google i18n phone number dataset.

Unless you are certain that you will only ever accept numbers from one locale, always in one format, I would strongly suggest not writing your own code for this and using libphonenumber to validate and display phone numbers.

forever°为你锁心
Reply #4 · 2020-01-22 11:06

My attempt at an unrestrictive regex:

/^[+#*\(\)\[\]]*([0-9][ ext+\-pw#*\(\)\[\]]*){6,45}$/

Accepts:

+(01) 123 (456) 789 ext555
123456
*44 123-456-789 [321]
123456
123456789012345678901234567890123456789012345
*****++[](][(((123456tteexxttppww

Rejects:

mob 07777 777777
1234 567 890 after 5pm
john smith
(empty)
1234567890123456789012345678901234567890123456
911

Sanitizing the input for display is up to you. Passing this validation only means that the input could be a phone number.
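A rough Python sketch of the check above, for anyone who wants to try it against the accept/reject lists. The function name could_be_phone_number is made up for illustration. The hyphen is escaped here (\-) on the assumption that a literal hyphen was intended: left unescaped, +-p reads as a character range that also covers the digits, which would let an over-long run of digits through.

```python
import re

# The pattern from this answer, with the hyphen escaped so that "+-p"
# is not interpreted as a character range (an assumption about the intent).
UNRESTRICTIVE = re.compile(r"^[+#*\(\)\[\]]*([0-9][ ext+\-pw#*\(\)\[\]]*){6,45}$")


def could_be_phone_number(text: str) -> bool:
    """Return True if text holds 6 to 45 digits plus only the allowed filler characters."""
    return UNRESTRICTIVE.match(text) is not None


if __name__ == "__main__":
    for sample in ["+(01) 123 (456) 789 ext555", "911", "john smith"]:
        print(repr(sample), could_be_phone_number(sample))
```

Because each repetition has to start with a digit and the filler class contains no digits, the {6,45} bound counts digits directly.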

做个烂人
Reply #5 · 2020-01-22 11:06

I work for a market research company and we have to filter these types of input alllll the time. You're complicating it too much. Just strip the non-alphanumeric chars, and see if there's an extension.
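One way the strip-then-check-for-extension idea might look in Python. This is only a sketch: the function name split_number_and_extension and the set of extension markers (x, ext, extension) are assumptions for illustration, not something the answer specifies.

```python
import re
from typing import Optional, Tuple

# Digits, optionally followed by an extension marker and more digits.
# The marker list is an assumption; adjust it to whatever your inputs use.
NUMBER_WITH_EXTENSION = re.compile(r"(\d+)(?:(?:extension|ext|x)(\d+))?")


def split_number_and_extension(raw: str) -> Optional[Tuple[str, str]]:
    """Strip non-alphanumeric characters, then split off a trailing extension.

    Returns (digits, extension) with an empty extension when there is none,
    or None when what remains is not digits plus an optional extension.
    """
    cleaned = re.sub(r"[^0-9A-Za-z]", "", raw).lower()
    match = NUMBER_WITH_EXTENSION.fullmatch(cleaned)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


if __name__ == "__main__":
    print(split_number_and_extension("1-234-567-8901 x1234"))
    print(split_number_and_extension("1 (234) 567-8901"))
```

All seven formats from the question reduce to the same digit string this way, with the extension reported separately.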

For further analysis you can subscribe to one of many providers that will give you access to a database of valid numbers as well as tell you if they're landlines or mobiles, disconnected, etc. It costs money.

Summer. ? 凉城
Reply #6 · 2020-01-22 11:07

I was struggling with the same issue, trying to make my application future proof, but these guys got me going in the right direction. I'm not actually checking the number itself to see if it works or not, I'm just trying to make sure that a series of numbers was entered that may or may not have an extension.

Worst case, if the user had to pull an unformatted number from the XML file, they would still just type the digits into the phone's number pad (012345678x5), so there is no real reason to keep it pretty. That kind of regex would come out something like this for me:

\d+ ?\w{0,9} ?\d+
  • 01234467 extension 123456
  • 01234567x123456
  • 01234567890
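A quick Python check of that pattern against the examples, as a sketch only. The name looks_like_number is hypothetical, and using fullmatch (so the whole input has to fit) is an assumption, since the pattern as posted has no anchors.

```python
import re

# The pattern from this answer: digits, an optional short word, more digits.
LOOSE = re.compile(r"\d+ ?\w{0,9} ?\d+")


def looks_like_number(text: str) -> bool:
    """Return True if the entire input fits the loose digits/word/digits shape."""
    return LOOSE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["01234467 extension 123456", "01234567x123456", "1-234-567-8901"]:
        print(repr(sample), looks_like_number(sample))
```

Note that \w also matches digits and underscores, so the middle part does not have to be a word, and punctuation such as hyphens has to be stripped before the input will match.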
地球回转人心会变
Reply #7 · 2020-01-22 11:08

I first answered this on another SO question (Regex working wrong, matching unexpected things) and decided to post it on this thread as well, because no one here was addressing how to require or not require items; they were just handing out regexes.

From my post there, I've created a quick guide to help anyone build a regex for their own desired phone number format. The same caveat applies as on the other post: if you are too restrictive, you may not get the desired results, and there is no "one size fits all" solution for accepting every possible phone number in the world, only what you decide to accept as your format of choice. Use at your own risk.

Quick cheat sheet

  • Start the expression: /^
  • If you want to require a space, use: [\s] or \s
  • If you want to require parentheses, use: [(] and [)] . Using \( and \) is ugly and can make things confusing.
  • If you want anything to be optional, put a ? after it
  • If you want a hyphen, just type - or [-] . Inside brackets with other characters, though, put it first or last, or escape it: \-
  • If you want to accept different choices in a slot, put brackets around the options: [-.\s] will require a hyphen, period, or space. A question mark after the last bracket will make all of those optional for that slot.
  • \d{3} : Requires a 3-digit number: 000-999. Shorthand for [0-9][0-9][0-9].
  • [2-9] : Requires a digit 2-9 for that slot.
  • (\+|1\s)? : Accept a "plus" or a 1 and a space (pipe character, |, is "or"), and make it optional. The "plus" sign must be escaped.
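  • If you want specific numbers to match a slot, enter them: [246] will require a 2, 4, or 6. For multi-digit choices use a group with a pipe: (77|78) will require 77 or 78. (Brackets only ever match a single character, so [77|78] would match one 7, 8, or | .)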
  • $/ : End the expression
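To show how the pieces fit together, here is a minimal Python sketch that assembles several of the cheat-sheet items into one pattern. The overall shape (a US-style 10-digit number) and the names US_PHONE, PREFIX_77_OR_78 and matches_us_format are assumptions picked for illustration, not a recommended format.

```python
import re

# Each fragment below comes from the cheat sheet; the combination is illustrative.
US_PHONE = re.compile(
    r"^"           # start the expression
    r"(\+|1\s)?"   # optional "+" or "1 " prefix
    r"[(]?"        # optional opening parenthesis
    r"[2-9]\d{2}"  # area code: first digit 2-9, then two more digits
    r"[)]?"        # optional closing parenthesis
    r"[-.\s]?"     # optional hyphen, period, or space
    r"\d{3}"       # three digits
    r"[-.\s]?"     # optional hyphen, period, or space
    r"\d{4}"       # four digits
    r"$"           # end the expression
)

# A choice between multi-character options needs a group with a pipe.
PREFIX_77_OR_78 = re.compile(r"^(77|78)$")


def matches_us_format(text: str) -> bool:
    """Return True if text fits the illustrative US-style pattern above."""
    return US_PHONE.match(text) is not None


if __name__ == "__main__":
    for sample in ["1 (234) 567-8901", "234.567.8901", "1-234-567-8901"]:
        print(repr(sample), matches_us_format(sample))
```

The last sample is rejected because the optional prefix requires a space after the 1, which is the kind of "too restrictive" outcome the caveat above warns about. The two parentheses are also optional independently of each other, so this sketch does not insist that they come as a pair.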