How to split mobile number into country code, area

Posted 2019-01-11 17:25

How can I split a mobile number into country code, area code and local number? For example, +919567123456 should be split into:

country code = 91

area code = 9567

local number = 123456

7 answers
三岁会撩人
#2 · 2019-01-11 18:03

If you are striving for accurate UK data, see also http://code.google.com/p/ofcom-csverter/ for a complete list of UK area codes with corrections.

我命由我不由天
#3 · 2019-01-11 18:07

I think you will need something like a dictionary of country and area codes, because both of them can have different lengths: USA +1, Germany +49, even +6723. The same goes for the area codes.

别忘想泡老子
#4 · 2019-01-11 18:08

A very complex problem. First you need to determine the country code. Depending on the country code, the rest has to be split into area code and local number. But none of the three parts has a fixed length, and neither does the whole number or the combination of area code and local part!

Example: 4930123456789

  • 49 is the country code of Germany
  • 30 is the area code of Berlin
  • 123456789 is a local number in Berlin (no real one)

Example: 493328123456

  • 49 is the country code of Germany
  • 3328 is the area code of Teltow
  • 123456 is a local number in Teltow (no real one)

Example: 34971123456

  • 34 is the country code of Spain
  • 971 is the area code of Mallorca
  • 123456 is a local number on Mallorca (no real one)
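
The two-step approach described above (country code first, then a country-specific area code lookup) could be sketched in Python roughly as follows. The tables are tiny and hypothetical, holding only the three examples from this answer, and the names `COUNTRY_CODES`, `AREA_CODES` and `split_number` are made up for illustration. Trying country code prefixes of one to three digits is also an assumption of this sketch.

```python
# Hypothetical lookup tables containing only the examples above.
COUNTRY_CODES = {"49": "Germany", "34": "Spain"}
AREA_CODES = {
    "49": {"30": "Berlin", "3328": "Teltow"},
    "34": {"971": "Mallorca"},
}


def split_number(number):
    """Split an international number into (country code, area code, local number).

    Raises ValueError if no known country code or area code matches.
    """
    digits = number.lstrip("+")

    # Step 1: find the country code by trying prefixes of 1 to 3 digits.
    country = next(
        (digits[:n] for n in (1, 2, 3) if digits[:n] in COUNTRY_CODES), None
    )
    if country is None:
        raise ValueError("unknown country code: " + number)
    rest = digits[len(country):]

    # Step 2: find the area code, trying the longest candidate first.
    areas = AREA_CODES.get(country, {})
    longest = max((len(a) for a in areas), default=0)
    for n in range(min(longest, len(rest)), 0, -1):
        if rest[:n] in areas:
            return country, rest[:n], rest[n:]
    raise ValueError("unknown area code: " + number)


print(split_number("4930123456789"))  # ('49', '30', '123456789')
print(split_number("493328123456"))   # ('49', '3328', '123456')
print(split_number("34971123456"))    # ('34', '971', '123456')
```

With real data the two dictionaries would have to cover every country and every area code, which is exactly the maintenance burden that makes the problem hard.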
forever°为你锁心
#5 · 2019-01-11 18:10

Don't maintain your own table of all this data! Use the "Java International Phone Number Utilities library v3.0", https://github.com/googlei18n/libphonenumber. This is what Google uses, and Google maintains it for you!

放我归山
#6 · 2019-01-11 18:12

As various people have mentioned, you cannot do this with simple string matching. Neither country codes nor area codes have a fixed length.

When we did this in the past, we maintained a table similar in structure to the following:

+------------+---------+-------+--------------+
|country_code|area_code|country|area          |
+------------+---------+-------+--------------+
|44          |1634     |UK     |Medway        |
|44          |20       |UK     |London        |
|964         |23       |Iraq   |Wasit (Al Kut)|
|964         |2412     |Iraq   |Unreal        |
+------------+---------+-------+--------------+

We then calculated the maximum length of area_code and country_code and checked the string by sub-stringing starting at the maximum length and working our way down until we found a match.

So, given the number 441634666788, we would have started with substring[1,7] (7 being the length of the longest country/area code combination), not found a match, then moved on to [1,6] and found the match for UK/Medway.

Not very efficient but it worked.
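
As a rough illustration of that longest-prefix search, here is a minimal Python sketch. It swaps the database table for an in-memory dictionary, which is an assumption of the sketch and not how the original was implemented. The rows are copied from the example table above; `ROWS`, `PREFIXES`, `MAX_LEN` and `find_area` are hypothetical names. Python slices are 0-based, so `number[:7]` corresponds to substring[1,7].

```python
# Rows copied from the example table: (country_code, area_code, country, area).
ROWS = [
    ("44", "1634", "UK", "Medway"),
    ("44", "20", "UK", "London"),
    ("964", "23", "Iraq", "Wasit (Al Kut)"),
    ("964", "2412", "Iraq", "Unreal"),
]

# Index the rows by the combined country + area prefix, e.g. "441634".
PREFIXES = {cc + ac: (cc, ac, country, area) for cc, ac, country, area in ROWS}

# Length of the longest combined prefix: 7 for this table ("9642412").
MAX_LEN = max(len(prefix) for prefix in PREFIXES)


def find_area(number):
    """Return (row, local_number) for the longest matching prefix, or None."""
    # Start at the maximum prefix length and work down to a single digit.
    for length in range(min(MAX_LEN, len(number)), 0, -1):
        row = PREFIXES.get(number[:length])
        if row is not None:
            return row, number[length:]
    return None


print(find_area("441634666788"))
# (('44', '1634', 'UK', 'Medway'), '666788')
```

Each lookup costs at most `MAX_LEN` dictionary probes, so with an in-memory table the inefficiency mentioned above is less of a concern than with repeated database queries.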

EDIT

You could also try something like this, but you would need to test it with a full data set, or maybe even break it down into separate country and area code selects, as it may not be very performant with your chosen DB.

-- Lookup table: one row per country/area code combination.
DECLARE @area_codes TABLE
(
    country_code VARCHAR(10),
    area_code VARCHAR(10),
    country VARCHAR(20),
    area VARCHAR(20),
    match_string VARCHAR(MAX),  -- LIKE pattern: country_code + area_code + '%'
    match_length INTEGER        -- number of digits in country_code + area_code
)

INSERT INTO @area_codes VALUES ('44', '1382', 'UK', 'Dundee', '441382%', 6)
INSERT INTO @area_codes VALUES ('44', '1386', 'UK', 'Evesham', '441386%', 6)
INSERT INTO @area_codes VALUES ('44', '1', 'UK', 'Geographic numbers', '441%', 3)

DECLARE @number VARCHAR(MAX)
SET @number = '441386111111'

-- Several patterns may match; the longest one wins (here the Evesham row).
SELECT TOP 1 *
FROM @area_codes
WHERE @number LIKE match_string
ORDER BY match_length DESC

You would maintain the match_string and match_length fields through a trigger, taking care to cope with null area codes, and you would index the table on the match_string column.

爷、活的狠高调
#7 · 2019-01-11 18:17

The answer very much depends on the country. There is no universal rule saying "this is country code, this is area code, this is local number". The only information that can be gained universally is the country number (and even that can be 1-4 digits long); then you need to consult the specific country's ruleset.

Some examples (there are many different phone numbers in each of these countries, but within a country they all follow the same format):

  • +420123456789 is a (bogus) number in the Czech Republic (country code +420), and the rest IS the local number. Some countries use an undivided addressing space, although you could infer a few bits of data from the first 1-4 digits of the local number (e.g. "+420800 are toll-free numbers"). So the only useful way to parse this number is into two parts: +420 123456789.
  • +18005551234 is a (probably also bogus) number in the US; according to the North American Numbering Plan, +1 is the country code, 800 is the area code (toll-free numbers), 555 is the exchange code and 1234 is the local number. You can then parse the number into four parts: +1 800 555 1234.
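
To make the idea of per-country rulesets concrete, here is a minimal Python sketch covering only the two examples above. The names `RULES`, `split_czech`, `split_nanp` and `parse` are hypothetical. The sketch assumes the country code can be found by trying prefixes of one to three digits; a real implementation would need a ruleset for every country.

```python
def split_czech(national):
    # Undivided numbering space: everything after +420 is the local number.
    return [national]


def split_nanp(national):
    # North American Numbering Plan: 3-digit area code, 3-digit exchange
    # code, 4-digit local number.
    if len(national) != 10:
        raise ValueError("NANP numbers have 10 digits after the country code")
    return [national[:3], national[3:6], national[6:]]


# Hypothetical ruleset table: country code -> function splitting the rest.
RULES = {"420": split_czech, "1": split_nanp}


def parse(number):
    """Return the number split into space-separated parts."""
    digits = number.lstrip("+")
    # Try 1-, 2- and 3-digit prefixes until a known country code is found.
    for n in (1, 2, 3):
        rule = RULES.get(digits[:n])
        if rule is not None:
            return "+" + " ".join([digits[:n]] + rule(digits[n:]))
    raise ValueError("no ruleset for " + number)


print(parse("+420123456789"))  # +420 123456789
print(parse("+18005551234"))   # +1 800 555 1234
```

Keeping one function per country keeps each ruleset isolated, so adding a country means adding one entry to `RULES` without touching the others.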