Parse RDF XML file to get all rdf:about values

Posted 2019-07-19 22:43

Question:

I am using PHP's SimpleXML and XPath to parse an RDF/XML file, and I am struggling to get a list of all the rdf:about values.

Any advice?

Answer 1:

Before PHP 5.3, SimpleXML seems to have trouble with namespaced attributes: anything containing a colon is dropped when the XML is converted to object properties of a SimpleXMLElement. The following workaround does the job, but feels hackish to me:

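// $rdf holds the RDF/XML source as a string.
// Rename the attribute so it no longer carries a namespace prefix.
// Note: this also rewrites any 'rdf:about' that occurs in text content.
$rdf = str_replace('rdf:about', 'rdf_about', $rdf);
$rdf = new SimpleXMLElement($rdf);
foreach ($rdf->xpath('//@rdf_about') as $node) {
    echo $node, PHP_EOL;
}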

See here:

  • http://groups.google.com/group/comp.lang.php/browse_thread/thread/d2a9b29ee21f7403/c6b24b6d398ece2c
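
On a PHP version where SimpleXML handles namespaced attributes, the string replacement above should not be needed. The following is a minimal sketch, not a tested fix for the pre-5.3 issue: it binds the rdf prefix with registerXPathNamespace() and queries the attribute directly. The sample document and its example.org URIs are made up for illustration, and the short array syntax assumes PHP 5.4 or later.

```php
<?php
// Hypothetical sample input; the resource URIs are placeholders.
$rdf = <<<'XML'
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/a">
    <dc:title>First</dc:title>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/b">
    <dc:title>Second</dc:title>
  </rdf:Description>
</rdf:RDF>
XML;

$xml = new SimpleXMLElement($rdf);

// Bind the prefix used in the XPath expression to the RDF namespace URI.
$xml->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');

// Collect every rdf:about attribute value as a plain string.
$abouts = [];
foreach ($xml->xpath('//@rdf:about') as $attribute) {
    $abouts[] = (string) $attribute;
}

print_r($abouts);
```

Casting each result to string matters because xpath() returns SimpleXMLElement objects, not plain strings.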

Alternatively, you could use DOM instead of SimpleXML:

$dom = new DOMDocument();
$dom->loadXML($rdf);
$xph = new DOMXPath($dom);
// Bind the 'rdf' prefix used in the query to the RDF namespace URI.
$xph->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
foreach ($xph->query('//@rdf:about') as $attribute) {
    echo $attribute->value, PHP_EOL;
}
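
If the values are needed as a list rather than echoed, the DOM approach can be wrapped in a small function. This is only a sketch: the function name collectRdfAbout, the RDF_NS constant, and the sample document are hypothetical and introduced here for illustration, and the exception on malformed input is one possible design choice rather than anything the original answer prescribes.

```php
<?php
// Namespace URI of the RDF syntax vocabulary.
const RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';

/**
 * Return all rdf:about attribute values found in an RDF/XML string.
 * (Hypothetical helper, not part of any library.)
 */
function collectRdfAbout($rdfXml)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadXML($rdfXml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new InvalidArgumentException('Input is not well-formed XML.');
    }

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('rdf', RDF_NS);

    $values = [];
    foreach ($xpath->query('//@rdf:about') as $attribute) {
        $values[] = $attribute->value;
    }
    return $values;
}

// Usage with a made-up document; the example.org URIs are placeholders.
$sample = '<rdf:RDF xmlns:rdf="' . RDF_NS . '">'
        . '<rdf:Description rdf:about="http://example.org/a"/>'
        . '<rdf:Description rdf:about="http://example.org/b"/>'
        . '</rdf:RDF>';

foreach (collectRdfAbout($sample) as $about) {
    echo $about, PHP_EOL;
}
```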

That said, I suggest using a dedicated RDF library for this rather than SimpleXML or DOM:

  • http://arc.semsol.org/docs/v2/parsing
  • http://www.seasr.org/wp-content/plugins/meandre/rdfapi-php/doc/
  • http://librdf.org/raptor/
  • http://phpxmlclasses.sourceforge.net/show_doc.php?class=class_rdf_parser.html

And here's a blog post about the parsers:

  • http://www.wasab.dk/morten/blog/archives/2004/05/31/easy-rdf-parsing-with-php