How can I read in data with uneven spacings?

2019-08-02 09:47发布

I have data which doesn't appear to have consistent spacings or positioning. It looks like:

1675 C Street , Suite 201   
Anchorage   AK  99501   
61.205475   -149.886882
600 Azalea Road 
Mobile    AL      36609 
30.656824   -88.148781
1601 Harbor Bay Parkway , Suite 150 
Alameda   CA    94502   
37.726114   -122.240546
1900 Point West Way,  Suite 270 
Sacramento      CA  95815   
38.5994175  -121.4315844
3600 Wilshire Blvd., Suite 1500 
Los Angeles   CA    90010   
34.06153    -118.303463

From this I'd like to extract the street address, city name, state, zip code, lat, and long. I thought the following code would work, but it produces very weird results.

data voa;
    input Address $50.;
    input City $ State $ Zip;
    input Latitude Longitude;
    datalines;

I think the issue comes from the fact that there isn't consistent spacing or positioning of the elements.

标签: sas
2条回答
再贱就再见
2楼-- · 2019-08-02 10:10

In the absence of consistent delimiters or fixed width fields, this is easier to do using scan:

data want;
infile cards truncover;
  length STATE $2 CITY $32;
  input Address $50.;
  input;
  ZIP = input(scan(_INFILE_, -1),5.);
  STATE = scan(_INFILE_, -2);
  CITY = trim(substr(_INFILE_,1,index(_INFILE_,STATE) - 1));
  input Latitude Longitude;
cards;
1675 C Street , Suite 201   
Anchorage   AK  99501   
61.205475   -149.886882
600 Azalea Road 
Mobile    AL      36609 
30.656824   -88.148781
1601 Harbor Bay Parkway , Suite 150 
Alameda   CA    94502   
37.726114   -122.240546
1900 Point West Way,  Suite 270 
Sacramento      CA  95815   
38.5994175  -121.4315844
3600 Wilshire Blvd., Suite 1500 
Los Angeles   CA    90010   
34.06153    -118.303463
;
run;
查看更多
三岁会撩人
3楼-- · 2019-08-02 10:24

Your data will work fine using LIST input you just need to add the "look for double delimiter option" & to CITY plus it need to be a bit longer $16 or so.

input City &$16. State $ Zip;

enter image description here

查看更多
登录 后发表回答