Parse JSON object in SAS macro

Posted 2019-01-15 17:37

Question:

Here is the input JSON file. It has to be parsed into a SAS dataset.

"results":
[
 {
    "acct_nbr": 1234,
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
        }
 }
,
{
    "acct_nbr": 3456,
    "firstName": "Sam",
    "lastName": "Jones",
    "age": 32,
    "address": {
        "streetAddress": "25 2nd Street",
        "city": "New Jersy",
        "state": "NJ",
        "postalCode": "10081"
        }
 }
]

I want the output to contain only the address fields, in a SAS dataset like this:

ACCT_NBR    FIELD_NAME  FIELD_VALUE
1234    streetAddress   21 2nd Street
1234    city    New York
1234    state   NY
1234    postalCode  10021
3456    streetAddress   25 2nd Street
3456    city    New Jersy
3456    state   NJ
3456    postalCode  10081

I have tried several approaches, but none produced this output. I even tried the SCANOVER option, which I found in a PDF, but I still cannot get the desired result.

Here is my code and its output:

LIBNAME src  '/home/user/read_JSON';

filename data '/home/user/read_JSON/test2.json';
data src.testdata2;
    infile data lrecl = 32000 truncover scanover;
        input @'"streetAddress": "' streetAddress $255. @'"city": "' city $255. @'"state": "' state $2. @'"postalCode": "' postalCode $255.;
        streetAddress = substr(streetAddress,1,index(streetAddress,'",')-2);
        city = substr( city,1,index( city,'",')-2);
        state = substr(state,1,index(state,'",')-2);
        postalCode = substr(postalCode,1,index(postalCode,'",')-2);
run;

proc print data=src.testdata2;
RUN;

My output in the .lst file:

The SAS System   09:44 Tuesday, January 14, 2014   1
           street                            postal
 Obs      Address         city      state     Code

  1     21 2nd Stree    New Yor       NY      10021"
  2     25 2nd Stree    New Jers      NJ      10081"

Answer 1:

To answer your question with a SAS-only solution, there are two things to fix:

  • Use SCAN instead of SUBSTR to strip the trailing comma and quotation mark from each value.
  • acct_nbr is a number, so its value is not quoted in the JSON; leave the final quotation mark out of the @'...' search string for that field.
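
The first point can be checked in isolation. In the question's code, INDEX returns the position of the closing quotation mark, so a length of INDEX minus 2 drops the last real character of the value (the value ends at INDEX minus 1). The last field, postalCode, is not followed by '",' at all, so INDEX finds nothing to cut at and the trailing quotation mark survives. A minimal sketch, using a hypothetical variable raw that holds what the $255. informat would have read:

```sas
data _null_;
    length raw $255;
    raw = '21 2nd Street",';            /* value plus the trailing quote and comma */

    p = index(raw, '",');               /* position of the closing quotation mark */
    too_short = substr(raw, 1, p - 2);  /* what the question's code does */
    exact     = substr(raw, 1, p - 1);  /* the value ends one character before the quote */
    scanned   = scan(raw, 1, ',"');     /* first token before any comma or quote */

    put too_short= exact= scanned=;
run;
```

The log should show 21 2nd Stree for too_short and 21 2nd Street for the other two. One caveat: because SCAN treats every comma as a delimiter here, a value that itself contains a comma would be cut at that comma.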

Here's the corrected code (I changed the directories; you'll need to change them back):

filename data 'c:\temp\json.txt';
data testdata2;
    infile data lrecl = 32000 truncover scanover;
    /* each @'string' moves the pointer to just past the key before the value is read */
    input
        @'"acct_nbr": ' acct_nbr $255.   /* numeric in the JSON, so no quote after the colon */
        @'"streetAddress": "' streetAddress $255.
        @'"city": "' city $255.
        @'"state": "' state $2.
        @'"postalCode": "' postalCode $255.;

    /* keep only the text before the first comma or quotation mark */
    acct_nbr = scan(acct_nbr,1,',"');
    streetAddress = scan(streetAddress,1,',"');
    city = scan(city,1,',"');
    state = scan(state,1,',"');
    postalCode = scan(postalCode,1,',"');
run;

proc print data=testdata2;
RUN;
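
The step above still yields one row per account, whereas the question asks for one row per address field. A minimal sketch of that last step, assuming the testdata2 dataset created above; address_long, field_name and field_value are names chosen here to match the requested layout:

```sas
data address_long;
    set testdata2;
    length field_name $32 field_value $255;

    /* group the four address variables so they can be looped over */
    array addr {*} streetAddress city state postalCode;

    do i = 1 to dim(addr);
        field_name  = vname(addr{i});   /* name of the current variable */
        field_value = addr{i};          /* its value for this account */
        output;                         /* one output row per address field */
    end;

    keep acct_nbr field_name field_value;
run;

proc print data=address_long;
run;
```

An array is used here because all four variables are character; a numeric variable would have to be converted before it could share field_value.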


Answer 2:

You can use proc groovy to parse JSON pretty easily (assuming you know Groovy). This SAS blog on authenticating to Twitter shows a detailed example of how to do it; here are some of the highlights.

This assumes you have the Groovy JAR files (http://groovy.codehaus.org/Download) and a way to write the output files (the example uses OpenCSV).

Below is my attempt at it; I don't think it quite works, but I don't know Groovy, either. The general concept should be correct. If you want to try this approach but can't figure out the specifics, you might either retag your question groovy or ask a new question with that tag.

%let groovydir=C:\Program Files\SASHome_9.4\SASFoundation\9.4\groovy; *the folder where the Groovy JARs are located;

%let sourcefile=c:\temp\json.txt;
%let outfile=c:\temp\json.csv;

proc groovy classpath="&groovydir.\groovy-all-2.2.0.jar;&groovydir.\opencsv-2.3.jar"; 

   submit "&sourcefile" "&outfile"; 
      import groovy.json.*
      import au.com.bytecode.opencsv.CSVWriter

      def input = new File(args[0]).text
      def output = new JsonSlurper().parseText(input)
      def csvoutput = new FileWriter(args[1])

      CSVWriter writer = new CSVWriter(csvoutput);

      String[] header = new String[8];
      header[0] = "results.acct_nbr";
      header[1] = "results.firstName";
      header[2] = "results.lastName";
      header[3] = "results.age";
      header[4] = "results.address.streetAddress";
      header[5] = "results.address.city";
      header[6] = "results.address.state";
      header[7] = "results.address.postalCode";
      writer.writeNext(header);

      // The array in this file is named results, statuses came from the Twitter example.
      // Assumes the file is wrapped in { } so that it is valid JSON.
      output.results.each {
         String[] content = new String[8];
         content[0] = it.acct_nbr.toString();
         content[1] = it.firstName.toString();
         content[2] = it.lastName.toString();
         content[3] = it.age.toString();
         content[4] = it.address.streetAddress.toString();
         content[5] = it.address.city.toString();
         content[6] = it.address.state.toString();
         content[7] = it.address.postalCode.toString();
         writer.writeNext(content)
      }

      writer.close();

    endsubmit; 
 quit;
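
Once the Groovy step has written the CSV, the file still has to be read back into SAS. A minimal sketch, assuming the &outfile macro variable defined above; json_csv is a hypothetical dataset name. The header names contain periods, which are not valid in SAS variable names, so with VALIDVARNAME=V7 PROC IMPORT should replace them with underscores (for example results_acct_nbr):

```sas
options validvarname=v7;            /* force standard SAS variable names */

proc import datafile="&outfile"
            out=work.json_csv       /* hypothetical dataset name */
            dbms=csv
            replace;
    getnames=yes;                   /* first row of the CSV holds the header */
run;

proc print data=work.json_csv;
run;
```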


Answer 3:

I used this JSON file and the code above as an example in a thread on sas.com. One of the expert programmers there was extremely generous and came up with a solution. Note that the JSON file must be wrapped in "{}" so that it is a valid JSON object.

Link: https://communities.sas.com/thread/72163
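
The wrapping mentioned above can be done by hand or with a short DATA step. A minimal sketch, assuming the original file is c:\temp\json.txt (the path used in the other answers) and a hypothetical output path; it copies the file line by line and adds an opening and a closing brace:

```sas
filename rawjson 'c:\temp\json.txt';          /* original, unwrapped file */
filename fixjson 'c:\temp\json_wrapped.txt';  /* hypothetical output path */

data _null_;
    infile rawjson lrecl=32000 end=last;
    file fixjson lrecl=32000;
    input;                          /* load the current line into _infile_ */
    if _n_ = 1 then put '{';        /* opening brace before the first line */
    put _infile_;                   /* copy the line unchanged */
    if last then put '}';           /* closing brace after the last line */
run;
```

The wrapped copy is then the file to pass to the parser in place of the original.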

Code:

filename cp temp;
proc groovy classpath=cp;


add classpath="C:\Program Files\Java\groovy-2.3.4\embeddable\groovy-all-2.3.4.jar";
/*or*/
/*
add classpath="C:\Program Files\Java\groovy-2.3.4\lib\groovy-2.3.4.jar";
add classpath="C:\Program Files\Java\groovy-2.3.4\lib\groovy-json-2.3.4.jar";
*/

submit parseonly;
import groovy.json.JsonSlurper
class MyJsonParser {
    def parseFile(path) {
        def jsonFile = new File(path)
        def jsonText = jsonFile.getText()
        def InputJSON = new JsonSlurper().parseText(jsonText)
        def accounts = []


        InputJSON.results.each{
            accounts << [
                    acct_nbr      : it.acct_nbr.toString(),
                    firstName     : it.firstName,
                    lastName      : it.lastName,
                    age           : it.age.toString(),
                    streetAddress : it.address.streetAddress,
                    city          : it.address.city,
                    state         : it.address.state,
                    postalCode    : it.address.postalCode
            ]
        }


        return accounts
    }
}
endsubmit;


submit parseonly;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;


public class MyJsonParser4Sas {
    public String filename = "";


    public void init() {
        MyJsonParser myParser = new MyJsonParser();
        accounts = myParser.parseFile(filename);
        iter = accounts.iterator();
    }


    public boolean hasNext() {
        return iter.hasNext();
    }


    public void getNext() {
        account = ((LinkedHashMap) (iter.next()));
    }


    public String getString(String k) {
        return account.get(k);
    }


    protected ArrayList accounts;
    protected Iterator iter;
    protected LinkedHashMap account;
}
endsubmit;


quit;


options set=classpath "%sysfunc(pathname(cp,f))";

data accounts;
   attrib id            label="Account Index"  length=    8
          acct_nbr      label="Account Number" length=$  10
          firstName     label="First Name"     length=$  20
          lastName      label="Last Name"      length=$  30
          age           label="Age"            length=$   3
          streetAddress label="Street Address" length=$ 128
          city          label="City"           length=$  40
          state         label="State"          length=$   2
          postalCode    label="Postal Code"    length=$   5;


   dcl javaobj accounts("MyJsonParser4Sas");
   accounts.exceptiondescribe(1);


   accounts.setStringField("filename", "C:\\foo.json");


   accounts.callVoidMethod("init");


   accounts.callBooleanMethod("hasNext",rc);
   do id=1 by 1 while(rc);
      accounts.callVoidMethod("getNext");
      accounts.callStringMethod("getString", "acct_nbr", acct_nbr);
      accounts.callStringMethod("getString", "firstName", firstName);
      accounts.callStringMethod("getString", "lastName", lastName);
      accounts.callStringMethod("getString", "age", age);
      accounts.callStringMethod("getString", "streetAddress", streetAddress);
      accounts.callStringMethod("getString", "city", city);
      accounts.callStringMethod("getString", "state", state);
      accounts.callStringMethod("getString", "postalCode", postalCode);
      output;
      accounts.callBooleanMethod("hasNext",rc);
   end;

   drop rc;
run;
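
To get from this one-row-per-account dataset to the ACCT_NBR / FIELD_NAME / FIELD_VALUE layout requested in the question, PROC TRANSPOSE is one option. A minimal sketch, assuming the accounts dataset created above with one row per acct_nbr; address_long is a hypothetical output name:

```sas
proc sort data=accounts;
    by acct_nbr;                    /* BY processing needs sorted input */
run;

proc transpose data=accounts
               out=address_long(drop=_label_
                                rename=(_name_=field_name col1=field_value));
    by acct_nbr;                                 /* one block of rows per account */
    var streetAddress city state postalCode;     /* only the address fields */
run;

proc print data=address_long;
run;
```

The _LABEL_ column is dropped because the ATTRIB statement above assigns labels, which PROC TRANSPOSE would otherwise carry into the output. If an acct_nbr appeared more than once, the extra rows would show up as additional columns (COL2 and so on) rather than extra rows.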