I'm trying to put together a comprehensive regex to validate phone numbers. Ideally it would handle international formats, but it must handle US formats, including the following:
- 1-234-567-8901
- 1-234-567-8901 x1234
- 1-234-567-8901 ext1234
- 1 (234) 567-8901
- 1.234.567.8901
- 1/234/567/8901
- 12345678901
I'll answer with my current attempt, but I'm hoping somebody has something better and/or more elegant.
If the users want to give you their phone numbers, then trust them to get it right. If they do not want to give it to you then forcing them to enter a valid number will either send them to a competitor's site or make them enter a random string that fits your regex. I might even be tempted to look up the number of a premium rate horoscope hotline and enter that instead.
I would also consider any of the following as valid entries on a web site:
I would also suggest looking at the "libphonenumber" Google Library. I know it is not regex but it does exactly what you want.
For example, it will recognize that:
is a possible number but not a valid number. It also supports countries outside the US.
Highlights of functionality:
- getNumberType: gets the type of the number based on the number itself; able to distinguish Fixed-line, Mobile, Toll-free, Premium Rate, Shared Cost, VoIP and Personal Numbers (whenever feasible).
- isNumberMatch: gets a confidence level on whether two numbers could be the same.
- getExampleNumber/getExampleNumberByType: provides valid example numbers for all countries/regions, with the option of specifying which type of example phone number is needed.
- isPossibleNumber: quickly guessing whether a number is a possible phonenumber by using only the length information, much faster than a full validation.
- isValidNumber: full validation of a phone number for a region using length and prefix information.
- AsYouTypeFormatter: formats phone numbers on-the-fly when users enter each digit.
- findNumbers: finds numbers in text input.
- PhoneNumberOfflineGeocoder: provides geographical information related to a phone number.
The biggest problem with phone number validation is that it is very culturally dependent.
- (408) 974–2042 is a valid US number
- (999) 974–2042 is not a valid US number
- 0404 999 999 is a valid Australian number
- (02) 9999 9999 is also a valid Australian number
- (09) 9999 9999 is not a valid Australian number
A regular expression is fine for checking the format of a phone number, but it's not really going to be able to check the validity of a phone number.
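To make the format-versus-validity point concrete, here is a minimal Python sketch. The pattern and the helper name looks_like_us_number are my own illustration (not part of libphonenumber), and the check only covers the (NNN) NNN-NNNN shape used in the examples above.

```python
import re

# Illustrative format-only pattern: "(NNN) NNN-NNNN".
# \u2013 is the en dash, accepted alongside the plain hyphen.
US_SHAPE = re.compile(r"\(\d{3}\) \d{3}[-\u2013]\d{4}")

def looks_like_us_number(text: str) -> bool:
    """Return True if text has the (NNN) NNN-NNNN shape; says nothing about validity."""
    return US_SHAPE.fullmatch(text) is not None

print(looks_like_us_number("(408) 974-2042"))  # True
print(looks_like_us_number("(999) 974-2042"))  # True, although 999 is not a valid area code
print(looks_like_us_number("0404 999 999"))    # False, different national format
```

The second call is the interesting one: the shape is right, so the regex is satisfied, but only a numbering-plan dataset can tell you the number cannot exist.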
I would suggest skipping a simple regular expression to test your phone number against, and using a library such as Google's libphonenumber (link to GitHub project).
Introducing libphonenumber!
Using one of your more complex examples, 1-234-567-8901 x1234, you get the following data out of libphonenumber (link to online demo):
So not only do you learn if the phone number is valid (which it is), but you also get consistent phone number formatting in your locale.
As a bonus, libphonenumber has a number of datasets to check the validity of phone numbers, as well, so checking a number such as +61299999999 (the international version of (02) 9999 9999) returns as a valid number with formatting:
libphonenumber also gives you many additional benefits, such as grabbing the location that the phone number is detected as being, and also getting the time zone information from the phone number:
But the invalid Australian phone number ((09) 9999 9999) returns that it is not a valid phone number.
Google's version has code for Java and JavaScript, but people have also implemented libraries for other languages that use the Google i18n phone number dataset:
Unless you are certain that you are always going to be accepting numbers from one locale, and they are always going to be in one format, I would heavily suggest not writing your own code for this, and using libphonenumber for validating and displaying phone numbers.
My attempt at an unrestrictive regex:
Accepts:
Rejects:
It is up to you to sanitize it for display. After validating it could be a number though.
I work for a market research company and we have to filter these types of input alllll the time. You're complicating it too much. Just strip the non-alphanumeric chars, and see if there's an extension.
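A minimal Python sketch of that approach follows. The function name split_number and the extension markers it recognises (x, ext, extension) are my assumptions for illustration; adjust them to whatever your inputs actually contain.

```python
import re

def split_number(raw: str) -> tuple[str, str]:
    """Strip non-alphanumeric characters, then split off an extension if one is present.

    Returns (number, extension); extension is "" when none is found.
    """
    cleaned = re.sub(r"[^0-9A-Za-z]", "", raw).lower()
    # Digits, an extension marker, then the extension digits.
    match = re.fullmatch(r"(\d+)(?:extension|ext|x)(\d+)", cleaned)
    if match:
        return match.group(1), match.group(2)
    return cleaned, ""

print(split_number("1-234-567-8901 x1234"))    # ('12345678901', '1234')
print(split_number("1-234-567-8901 ext1234"))  # ('12345678901', '1234')
print(split_number("1 (234) 567-8901"))        # ('12345678901', '')
```

If the cleaned string still contains letters and no extension was found, the input is returned as-is, so the caller can decide whether to reject it.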
For further analysis you can subscribe to one of many providers that will give you access to a database of valid numbers as well as tell you if they're landlines or mobiles, disconnected, etc. It costs money.
I was struggling with the same issue, trying to make my application future proof, but these guys got me going in the right direction. I'm not actually checking the number itself to see if it works or not, I'm just trying to make sure that a series of numbers was entered that may or may not have an extension.
Worst case scenario if the user had to pull an unformatted number from the XML file, they would still just type the numbers into the phone's numberpad 012345678x5, no real reason to keep it pretty. That kind of RegEx would come out something like this for me:
01234467 extension 123456
01234567x123456
01234567890
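For reference, here is one way a "digits with an optional extension" check could look in Python. This is a sketch of my own that accepts the sample inputs above; it is an assumption about the intent, not the answerer's original regex, and the names DIGITS_WITH_EXT and is_digits_with_optional_ext are hypothetical.

```python
import re

# Hypothetical pattern: a run of digits, optionally followed by
# "x", "ext" or "extension" (any case) and another run of digits.
DIGITS_WITH_EXT = re.compile(
    r"\d+(?:\s*(?:extension|ext|x)\s*\d+)?",
    re.IGNORECASE,
)

def is_digits_with_optional_ext(text: str) -> bool:
    """Return True for digits with an optional extension; no real-world validity check."""
    return DIGITS_WITH_EXT.fullmatch(text.strip()) is not None

for sample in ["01234467 extension 123456", "01234567x123456", "01234567890", "0123-4567"]:
    print(sample, is_digits_with_optional_ext(sample))
```

The first three samples are accepted and the last is rejected, because this pattern does not allow separators inside the number.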
I answered this question on another SO question before deciding to also include my answer on this thread, because no one was addressing how to require or not require items; they were just handing out regexes: Regex working wrong, matching unexpected things
From my post there, I've created a quick guide to assist anyone with making their own regex for their own desired phone number format. As I did on the other question, I will caveat that if you are too restrictive you may not get the desired results, and there is no "one size fits all" solution to accepting all possible phone numbers in the world, only what you decide to accept as your format of choice. Use at your own risk.
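Quick cheat sheet
- /^: Start the expression.
- To require a space, use [\s] or \s.
- To require parentheses, use [(] and [)]. Using \( and \) is ugly and can make things confusing.
- To make anything optional, put a ? after it.
- For a hyphen, just type - or [-]. If you do not put it first or last in a series of other characters inside brackets, though, you may need to escape it: \-
- To accept different choices in a slot, put brackets around the options: [-.\s] will require a hyphen, period, or space. A question mark after the last bracket will make all of those optional for that slot.
- \d{3}: Requires a 3-digit number: 000-999. Shorthand for [0-9][0-9][0-9].
- [2-9]: Requires a digit 2-9 for that slot.
- (\+|1\s)?: Accept a "plus" or a 1 and a space (pipe character, |, is "or"), and make it optional. The "plus" sign must be escaped.
- To match specific numbers in a slot, enter them: [246] will require a 2, 4, or 6. (?:77|78) will require 77 or 78. Note that [77|78] would not do this: inside brackets it matches a single 7, 8, or | character.
- $/: End the expression.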
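As a worked example, the pieces from the cheat sheet can be assembled into one pattern. The sketch below uses Python, which has no / delimiters, so only ^ and $ are kept. The particular combination and the names US_PHONE and is_us_phone are just one possible choice of mine, not a recommended universal format.

```python
import re

# Each fragment comes from the cheat sheet; adjacent raw strings are concatenated.
US_PHONE = re.compile(
    r"^"           # start the expression
    r"(\+|1\s)?"   # optional "+" or "1 " prefix
    r"[(]?"        # optional opening parenthesis
    r"[2-9]\d{2}"  # area code: first digit 2-9, then two more digits
    r"[)]?"        # optional closing parenthesis
    r"[-.\s]?"     # optional hyphen, period, or whitespace
    r"\d{3}"       # three digits
    r"[-.\s]?"     # optional separator
    r"\d{4}"       # four digits
    r"$"           # end the expression
)

def is_us_phone(text: str) -> bool:
    """Return True if text fits the illustrative pattern above."""
    return US_PHONE.match(text) is not None

print(is_us_phone("1 (234) 567-8901"))  # True
print(is_us_phone("234.567.8901"))      # True
print(is_us_phone("123-456-7890"))      # False: area code may not start with 1
print(is_us_phone("1-234-567-8901"))    # False: the prefix slot only allows "+" or "1 "
```

The last result shows the trade-off mentioned above: each slot accepts exactly what you wrote for it, so being restrictive in one place (here, the prefix) rejects inputs you might have wanted.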