POSITION()-function in PATH-element of XML-map wor

Posted 2019-07-17 06:05

I'm trying to import an XML file into SAS. The file is Google's geocoding response ("georesponse") to an address request. Here is a fragment of it:

<address_component>
  <long_name>1025</long_name>
  <short_name>1025</short_name>
  <type>street_number</type>
</address_component>
<address_component>
  <long_name>Gilford Street</long_name>
  <short_name>Gilford St</short_name>
  <type>route</type>
</address_component>
<address_component>
  <long_name>West End</long_name>
  <short_name>West End</short_name>
  <type>neighborhood</type>
  <type>political</type>
</address_component>
<address_component>
  <long_name>Vancouver</long_name>
  <short_name>Vancouver</short_name>
  <type>locality</type>
  <type>political</type>
</address_component>

You can get the full XML file by entering the following URL into a browser: http://maps.googleapis.com/maps/api/geocode/xml?address=1025,+Gilford+Street,+Vancouver&sensor=false

I want to convert it into SAS-dataset like this:

type              long_name

street_number      1025
route              Gilford Street
neighborhood       West End

etc

As you can see, some <address_component> elements contain only one <type> element (like street_number or route), but others contain two: the first holds the value of interest (e.g. 'neighborhood') and the second holds the value 'political', which I don't need. So I created an XML map in XML Mapper, using the predicate position()=1 to ensure that only the first occurrence of the <type> tag is used:

<NAMESPACES count="0"/>

<!-- ############################################################ -->
<TABLE name="GeoResponse">
    <TABLE-PATH syntax="XPath">/GeocodeResponse/result/address_component</TABLE-PATH>

    <COLUMN name="type">
        <PATH syntax="XPath">/GeocodeResponse/result/address_component/type[position()=1]</PATH>
        <TYPE>character</TYPE>
        <DATATYPE>string</DATATYPE>
        <LENGTH>27</LENGTH>
    </COLUMN>

    <COLUMN name="long_name">
        <PATH syntax="XPath">/GeocodeResponse/result/address_component/long_name</PATH>
        <TYPE>character</TYPE>
        <DATATYPE>string</DATATYPE>
        <LENGTH>17</LENGTH>
    </COLUMN>

</TABLE>

This works properly in XML Mapper itself (in the Table View tab). But when I run code that uses this map in SAS EG or Base SAS, the column 'type' is empty. If I leave position()=1 out of the map, everything else works (but for all items except street_number, route and postal_code I get 'political' as the type, not 'city', 'country' etc).

Does anybody have a clue where the problem might be?

Tags: xml sas
2 answers
chillily
#2 · 2019-07-17 06:30

Here's the answer from SAS support about this problem:

The problem with the POSITION() is a defect with the XMLV2 engine. Development is aware of this problem and we hope to get this fixed in an upcoming release. This should work fine for the XML engine if you can use this as a workaround.

Here's the note in SAS Knowledge Base: http://support.sas.com/kb/46/769.html

The suggested workaround (using the XML engine instead of XMLV2) works, but you have to manually change the version number in the XMLMap file from 2.1 to 1.2 or 1.1 (the XML engine doesn't work with later versions).
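
A minimal sketch of that workaround, assuming the map has been saved as c:\temp\google.map with the version attribute of its root element edited to 1.2, and the Google response saved as c:\temp\google.xml (both paths are placeholders). XMLTYPE=XMLMAP is spelled out here on the assumption that the older engine wants the map type stated explicitly. The result has not been verified against a particular SAS release.

```sas
/* Placeholder paths; point these at your own files */
filename google  'c:\temp\google.xml';   /* saved Google response */
filename SXLEMAP 'c:\temp\google.map';   /* map whose root reads version="1.2" */

/* Legacy XML engine instead of XMLV2; the libref matches the GOOGLE fileref */
libname google xml xmlmap=SXLEMAP xmltype=xmlmap access=readonly;

proc print data=google.georesponse;
run;
```

If the XML engine rejects map elements that belong to the 2.1 syntax, those may need to be removed as well; that is an assumption, not something the SAS note states.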

淡お忘
#3 · 2019-07-17 06:54

I'm running SAS 9.3; XML handling differs between SAS versions.

I could not get a basic map to do what you are looking for, even though position()=1 definitely looks like it should do what you want.

So I wrote a little data step to filter out the "political" line.

I got this to work:

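/* Fileref for the live Google geocoding response */
filename in url 'http://maps.googleapis.com/maps/api/geocode/xml?address=1025,+Gilford+Street,+Vancouver&sensor=false';

/* Write the XMLMap to disk from a DATA step */
filename SXLEMAP "c:\temp\google.map";
data _null_;
file SXLEMAP;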
put '<SXLEMAP name="SXLEMAP" version="2.1">';
put '<NAMESPACES count="0"/>';

put '<TABLE name="GeoResponse">';
put '    <TABLE-PATH syntax="XPath">/GeocodeResponse/result/address_component</TABLE-PATH>';

put '   <COLUMN name="type">';
/* The position()=1 path stays commented out: XMLV2 returned an empty column with it */
*put '       <PATH syntax="XPath">/GeocodeResponse/result/address_component/type[position()=1]</PATH>';
put '        <PATH syntax="XPath">/GeocodeResponse/result/address_component/type</PATH>';
put '        <TYPE>character</TYPE>';
put '        <DATATYPE>string</DATATYPE>';
put '        <LENGTH>27</LENGTH>';
put '    </COLUMN>';

put '    <COLUMN name="long_name">';
put '        <PATH syntax="XPath">/GeocodeResponse/result/address_component/long_name</PATH>';
put '        <TYPE>character</TYPE>';
put '        <DATATYPE>string</DATATYPE>';
put '        <LENGTH>17</LENGTH>';
put '    </COLUMN>';

put '</TABLE>';
put '</SXLEMAP>';
run;

/* Copy the response to a local file, dropping every line that mentions "political" */
filename  google 'c:\temp\google.xml';
data _null_;
file google;
infile in;
input;
if ^index(_infile_,"political") then
   put _infile_;
run;

/* The libref matches the GOOGLE fileref, so the engine reads the filtered local copy */
libname   google xmlv2 xmlmap=SXLEMAP access=READONLY;

proc print data=google.georesponse;
run;

Produces this:

             Obs    type                           long_name

               1    street_number                  1025
               2    route                          Gilford Street
               3    neighborhood                   West End
               4    locality                       Vancouver
               5    administrative_area_level_2    Greater Vancouver
               6    administrative_area_level_1    British Columbia
               7    country                        Canada
               8    postal_code                    V6G 1R2
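
One caveat with the filter above is that it drops any line containing the word "political", including a long_name or short_name line that happened to contain it. A slightly narrower variant, sketched under the assumption that the response keeps each type element on its own line as in the fragment from the question, matches the whole element instead:

```sas
/* Assumes the same filerefs as above: IN (the URL) and GOOGLE (the local copy) */
data _null_;
    infile in;
    file google;
    input;
    /* Drop only lines that contain the complete <type>political</type> element */
    if index(_infile_, '<type>political</type>') = 0 then put _infile_;
run;
```

Either way, a component whose only type is 'political' would end up with an empty type column.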