OSM Data parsing to get the nodes with child

2019-02-15 16:59发布

问题:

I download Open Street Map data for a small region, I want to filter the data to get the nodes with special category.

Here is sample of the OSM data

 <node id="505126369" lat="31.2933856" lon="34.2687443" user="JumpStart International" uid="125156" visible="true" version="1" changeset="2568758" timestamp="2009-09-22T13:05:10Z"/>
 <node id="505126372" lat="31.2682934" lon="34.2745680" user="JumpStart International" uid="125156" visible="true" version="1" changeset="2568758" timestamp="2009-09-22T13:05:10Z"/>
 <node id="505126375" lat="31.2953082" lon="34.3471630" user="JumpStart International" uid="125156" visible="true" version="1" changeset="2568758" timestamp="2009-09-22T13:05:10Z"/>
 <node id="505126378" lat="31.2807872" lon="34.2757999" user="JumpStart International" uid="125156" visible="true" version="1" changeset="2568758" timestamp="2009-09-22T13:05:11Z">
   <tag k="amenity" v="school"/>
   <tag k="name" v="Al Aqqad Basic &amp; Secondary Female School"/>
   <tag k="name:ar" v="مدرسة العقاد الأساسية والثانوية للبنات"/>
  </node>

I want to get the whole schools, hospitals in the data.

If anybody had made the XML parsing with PHP or Java, I would highly appreciate sharing it with me and all interests.

Edit Here is a simple start I have just

$dataFile = base_url() . 'media/files/osmdata/map_3.xml';
    //echo ($dataFile);

    $xml = simplexml_load_file($dataFile);

    //    $countTotal = count($xml->node);
    //   echo 'here'.$countTotal;
    foreach ($xml as $key => $val) {
        var_dump($val);
               // can't manage things overs here

    }

回答1:

The following is a little OSM Overpass API example with PHP SimpleXML I've compiled because we do not have it here for PHP and I love OSM, so let's show some useful examples.

The first part shows how you can query an Overpass Endpoint with standard PHP. You do not need that part because you have already saved the data on your harddisk:

<?php
/**
 * OSM Overpass API with PHP SimpleXML / XPath
 *
 * PHP Version: 5.4 - Can be back-ported to 5.3 by using 5.3 Array-Syntax (not PHP 5.4's square brackets)
 */


//
// 1.) Query an OSM Overpass API Endpoint
//

$query = 'node
  ["amenity"~".*"]
  (38.415938460513274,16.06338500976562,39.52205163048525,17.51220703125);
out;';

$context = stream_context_create(['http' => [
    'method'  => 'POST',
    'header' => ['Content-Type: application/x-www-form-urlencoded'],
    'content' => 'data=' . urlencode($query),
]]);

# please do not stress this service, this example is for demonstration purposes only.
$endpoint = 'http://overpass-api.de/api/interpreter';
libxml_set_streams_context($context);
$start = microtime(true);

$result = simplexml_load_file($endpoint);
printf("Query returned %2\$d node(s) and took %1\$.5f seconds.\n\n", microtime(true) - $start, count($result->node));

For you the second part is more interesting. That is querying the XML data you have already. This is most easily done with xpath, the used PHP XML library is based on libxml which supports XPath 1.0 which covers the various querying needs very well.

The following example lists all schools and tries to obtain their names as well. I have not covered translations yet because my sample data didn't have those, but you can also look for all kind of names including translations and just prefer a specific one):

//
// 2.) Work with the XML Result
//

# get all school nodes with xpath
$xpath = '//node[tag[@k = "amenity" and @v = "school"]]';
$schools = $result->xpath($xpath);
printf("%d School(s) found:\n", count($schools));
foreach ($schools as $index => $school)
{
    # Get the name of the school (if any), again with xpath
    list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
    printf("#%02d: ID:%' -10s  [%s,%s]  %s\n", $index, $school['id'], $school['lat'], $school['lon'], $name);
}

The key point here are the xpath queries. Two are used, the first one to get the nodes that have certain tags. I think this is the most interesting one for you:

//node[tag[@k = "amenity" and @v = "school"]]

This line says: Give me all node elements that have a tag element inside which has the k attribute value "amenity" and the v attribute value "school". This is the condition you have to filter out those nodes that are tagged with amenity school.

Further on xpath is used again, now relative to those school nodes to see if there is a name and if so to fetch it:

tag[@k = "name"]/@v'

This line says: Relative to the current node, give me the v attribute from a tag element that as the k attribute value "name". As you can see, some parts are again similar to the line before. I think you can both adopt them to your needs.

Because not all school nodes have a name, a default string is provided for display purposes by adding it to the (then empty) result array:

list($name) = $school->xpath('tag[@k = "name"]/@v') + ['(unnamed)'];
                                                    ^^^^^^^^^^^^^^^
                                                Provide Default Value

So here my results for that code-example:

Query returned 907 node(s) and took 1.10735 seconds.
10 School(s) found:
#00: ID:332534486   [39.5017565,16.2721899]  Scuola Primaria
#01: ID:1428094278  [39.3320912,16.1862820]  (unnamed)
#02: ID:1822746784  [38.9075566,16.5776597]  (unnamed)
#03: ID:1822755951  [38.9120272,16.5713431]  (unnamed)
#04: ID:1903859699  [38.6830409,16.5522243]  Liceo Scientifico Statale A. Guarasci
#05: ID:2002566438  [39.1347698,16.0736924]  (unnamed)
#06: ID:2056891127  [39.4106679,16.8254844]  (unnamed)
#07: ID:2056892999  [39.4124687,16.8286119]  (unnamed)
#08: ID:2272010226  [39.4481717,16.2894353]  SCUOLA DELL'INFANZIA SAN FRANCESCO
#09: ID:2272017152  [39.4502366,16.2807664]  SCUOLA MEDIA 

I hope this is useful already, let me know if you have more clarification questions.


(by rbwilkinson): This is how you could add additional parameters to find other values. the following example finds other properties within one kilometer:

$query = 'node
  ["addr:postcode"~"RM12"]
  (51.5557914,0.2118915,51.5673083,0.2369398);
   node
  (around:1000)
  ["amenity"~"fast_food"];
           out;';

$context = stream_context_create(['http' => [
    'method'  => 'POST',
    'header' => ['Content-Type: application/x-www-form-urlencoded'],
    'content' => 'data=' . urlencode($query),
]]);

$endpoint = 'http://overpass-api.de/api/interpreter';
libxml_set_streams_context($context);

$result = simplexml_load_file($endpoint);
printf("Query returned %2\$d node(s) and took %1\$.5f seconds.\n\n", microtime(true) - $start, count($result->node));
}