regex street address match

Posted 2019-01-07 21:21

While I know that matching a street address will never be perfect, I'm looking to create a couple of regex statements that will get close most of the time.

I'm trying to highlight an address. I suck at regex and I've tried to get close, but could someone help me understand how I can make this better?

string:

6 am - 11 pM , Palma Sola Elementary, 6806 Fifth Ave NW, Bradenton, FL 34209 Come find just near the dsfsd sa fsa fasdf asfsds 5001 west your momma doesn't live here my 2005 ford ranger,

Regex 1:

/\s+(\d{2,5}\s+)(?![a|p]m\b)(([a-zA-Z|\s+]{1,5}){1,2})?([\s|\,|.]+)?(([a-zA-Z|\s+]{1,30}){1,4})(court|ct|street|st|drive|dr|lane|ln|road|rd|blvd)([\s|\,|.|\;]+)?(([a-zA-Z|\s+]{1,30}){1,2})([\s|\,|.]+)?\b(AK|AL|AR|AZ|CA|CO|CT|DC|DE|FL|GA|GU|HI|IA|ID|IL|IN|KS|KY|LA|MA|MD|ME|MI|MN|MO|MS|MT|NC|ND|NE|NH|NJ|NM|NV|NY|OH|OK|OR|PA|RI|SC|SD|TN|TX|UT|VA|VI|VT|WA|WI|WV|WY)([\s|\,|.]+)?(\s+\d{5})?([\s|\,|.]+)/i

(Sometimes there's just a street and city, but no state or zip)

Regex 2:

/\b(\d{2,5}\s+)(?![a|p]m\b)(NW|NE|SW|SE|north|south|west|east|n|e|s|w)?([\s|\,|.]+)?(([a-zA-Z|\s+]{1,30}){1,4})(court|ct|street|st|drive|dr|lane|ln|road|rd|blvd)/i

Fiddle with it: http://jsfiddle.net/isuelt/rMC6P/11/

4 answers
#2 · 2019-01-07 21:41

I needed to do something similar for addresses like these:

800 SE 20 AVENUE #603, DEERFIELD BEACH

9801 NW 3 STREET APT 5, PLANTATION

11909 GLENMORE DRIVE #4-1, CORAL SPRINGS

This is the regex that I used:

\s*([0-9]*)\s((NW|SW|SE|NE|S|N|E|W))?(.*)((NW|SW|SE|NE|S|N|E|W))?((#|APT|BSMT|BLDG|DEPT|FL|FRNT|HNGR|KEY|LBBY|LOT|LOWR|OFC|PH|PIER|REAR|RM|SIDE|SLIP|SPC|STOP|STE|TRLR|UNIT|UPPR|\,)[^,]*)(\,)([\s\w]*)\n

It returns separate groups for each part of the address (I did not need to parse the state name for my case). Try it out here: https://regex101.com/r/OsvOxn/3
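For anyone who wants to try this outside regex101, here is a minimal JavaScript sketch of the same pattern. The helper name `parseAddress` and the field names are my own and not part of the answer; the group numbers (1, 2, 4, 7, 10) are counted from the pattern as written. The pattern ends in `\n`, so the sketch appends a newline to each line before matching.

```javascript
// Pattern copied verbatim from the answer above (no flags, so it is case-sensitive).
const ADDRESS_RE = /\s*([0-9]*)\s((NW|SW|SE|NE|S|N|E|W))?(.*)((NW|SW|SE|NE|S|N|E|W))?((#|APT|BSMT|BLDG|DEPT|FL|FRNT|HNGR|KEY|LBBY|LOT|LOWR|OFC|PH|PIER|REAR|RM|SIDE|SLIP|SPC|STOP|STE|TRLR|UNIT|UPPR|\,)[^,]*)(\,)([\s\w]*)\n/;

// Hypothetical helper: parse one address line into named parts.
function parseAddress(line) {
  // The pattern requires a trailing newline, so add one.
  const m = ADDRESS_RE.exec(line + '\n');
  if (!m) return null;
  return {
    number: m[1],               // house number
    direction: m[2] || null,    // leading NW/SE/... if present
    street: m[4].trim(),        // street name and type
    unit: m[7].trim(),          // "#603", "APT 5", ...
    city: m[10].trim(),         // text after the comma
  };
}

console.log(parseAddress('800 SE 20 AVENUE #603, DEERFIELD BEACH'));
// { number: '800', direction: 'SE', street: '20 AVENUE', unit: '#603', city: 'DEERFIELD BEACH' }
```

Because the unit group is not optional, a line without `#`, `APT`, or another unit marker before the comma may not parse the way you expect, so test it against your own data first.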

做自己的国王
#3 · 2019-01-07 21:49

Matt is right. Regex parsing is never going to be very accurate. You'll inevitably have a fair number of false positives and false negatives if you go down this dangerous road. However, if you're okay with that, I like to use a combination of two regexes, one for street-name-based schemes and one for city grid schemes:

Street Name System:

/\b\d{1,6} +.{2,25}\b(avenue|ave|court|ct|street|st|drive|dr|lane|ln|road|rd|blvd|plaza|parkway|pkwy)[.,]?(.{0,25} +\b\d{5}\b)?/ig

Grid System:

/(\b( +)?\d{1,6} +(north|east|south|west|n|e|s|w)[,.]?){2}(.{0,25} +\b\d{5}\b)?\b/ig

Also note that if the address doesn't have a state and zipcode, you can basically forget about extracting any text that goes after the street moniker.
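To see how the two patterns behave together, here is a rough JavaScript sketch that runs both over the sample string from the question and collects the matches with their positions. The helper name `findAddresses` and the shape of the returned objects are my own invention for illustration; only the two regexes come from the answer.

```javascript
// Both patterns are copied verbatim from the answer above.
const STREET_NAME_RE = /\b\d{1,6} +.{2,25}\b(avenue|ave|court|ct|street|st|drive|dr|lane|ln|road|rd|blvd|plaza|parkway|pkwy)[.,]?(.{0,25} +\b\d{5}\b)?/ig;
const GRID_RE = /(\b( +)?\d{1,6} +(north|east|south|west|n|e|s|w)[,.]?){2}(.{0,25} +\b\d{5}\b)?\b/ig;

// Hypothetical helper: return every match of either pattern, ordered by position.
function findAddresses(text) {
  const hits = [];
  for (const re of [STREET_NAME_RE, GRID_RE]) {
    // matchAll works on a copy of the regex, so lastIndex is not left dirty.
    for (const m of text.matchAll(re)) {
      hits.push({ start: m.index, end: m.index + m[0].length, text: m[0] });
    }
  }
  return hits.sort((a, b) => a.start - b.start);
}

const sample = "6 am - 11 pM , Palma Sola Elementary, 6806 Fifth Ave NW, Bradenton, FL 34209 Come find just near the dsfsd sa fsa fasdf asfsds 5001 west your momma doesn't live here my 2005 ford ranger,";

for (const hit of findAddresses(sample)) {
  console.log(hit.start, hit.end, hit.text);
}
// Finds one address in the sample: "6806 Fifth Ave NW, Bradenton, FL 34209"
```

The times ("6 am", "11 pM"), "5001 west" and "2005 ford ranger" are skipped here, but that is specific to this input. Neither pattern puts a word boundary after the street type or direction, so a word that merely starts with "st" or "dr" can still trigger a false positive.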

我只想做你的唯一
#4 · 2019-01-07 21:52

US addresses are not a regular language and cannot be matched reliably with regular expressions. Regexes are helpful in some isolated cases, but in general they will fail you, especially for input like that.

I used to work at an address verification company. In answer to your question, to "highlight an address" in a string of text, I recommend you try an extraction utility. There are a few out there and I suggest you look around, but here is ours using the input from your question. As you can see, it found the address and validated it:

[Image: LiveAddress extraction example]

The API endpoint returns JSON which contains the start and end positions of each address, as well as plenty of information about each one. (See the CSV output at the bottom of the picture above.)
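Once you have start and end positions for each address, the highlighting itself is simple. Below is a minimal JavaScript sketch of that step. It does not call any real API: the span shape (`start` inclusive, `end` exclusive) and the function name `highlightSpans` are assumptions for illustration, so check the actual response format of whatever extraction service you use.

```javascript
// Hypothetical helper: wrap each [start, end) span of `text` in <mark> tags.
// Assumes `start` is inclusive and `end` is exclusive (an assumption, not a documented format).
function highlightSpans(text, spans) {
  const sorted = [...spans].sort((a, b) => a.start - b.start);
  let out = '';
  let cursor = 0;
  for (const { start, end } of sorted) {
    // Skip spans that overlap an earlier one or fall outside the text.
    if (start < cursor || end > text.length || end <= start) continue;
    out += text.slice(cursor, start) + '<mark>' + text.slice(start, end) + '</mark>';
    cursor = end;
  }
  return out + text.slice(cursor);
}

// Usage with a made-up span computed locally instead of coming from a service.
const text = 'Meet at 6806 Fifth Ave NW, Bradenton, FL 34209 today';
const addr = '6806 Fifth Ave NW, Bradenton, FL 34209';
const start = text.indexOf(addr);
console.log(highlightSpans(text, [{ start, end: start + addr.length }]));
// Meet at <mark>6806 Fifth Ave NW, Bradenton, FL 34209</mark> today
```

The sketch does not HTML-escape the text, so escape it first if the input is untrusted and the result goes into a page.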

I commend you for braving those regular expressions you tried! Hopefully this is helpful.

Root(大扎)
#5 · 2019-01-07 21:57

This works for me!

// `address` is assumed to be a string defined elsewhere.
if (address.match(/^\s*\S+(?:\s+\S+){2}/)) {
   console.log('good address!')
}
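It helps to be clear about what this pattern checks: it only requires at least three whitespace-separated tokens at the start of the string, nothing address-specific. Here is a small JavaScript sketch that makes this visible. The wrapper name `looksLikeAddress` is mine, not from the answer.

```javascript
// Hypothetical wrapper around the pattern from the answer above.
const looksLikeAddress = (s) => /^\s*\S+(?:\s+\S+){2}/.test(s);

console.log(looksLikeAddress('6806 Fifth Ave NW'));            // true
console.log(looksLikeAddress("your momma doesn't live here")); // true (three or more tokens)
console.log(looksLikeAddress('Main St'));                      // false (only two tokens)
```

So it can work as a quick sanity check on a field that is already known to hold an address, but it will not find or highlight an address inside free text.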