XML parsing using PHP when element names are dynamic

Posted 2019-02-20 18:48

Question:

SimpleXMLElement Object
(
         [IpStatus] => 1
         [ti_pid_20642] => SimpleXMLElement Object
               (

I have a SimpleXMLElement in the format above. The XML is generated at run time, and its node names, such as ti_pid_20642, are partly dynamic: for example ti_pid_3232, ti-pid_2323, ti_pid_anyumber.

My question is: how can I get these nodes' values and their children using PHP?

Answer 1:

To get all node names used in an XML string with SimpleXML, you can use SimpleXMLIterator:

$tagnames = array_keys(iterator_to_array(
    new RecursiveIteratorIterator(
        new SimpleXMLIterator($string)
        , RecursiveIteratorIterator::SELF_FIRST
    )
));

print_r($tagnames);

This could give you output like the following (you did not provide any XML in your question, so the input is made up):

Array
(
    [0] => IpStatus
    [1] => ti_pid_20642
    [2] => dependend
    [3] => ti-pid_2323
    [4] => ti_pid_anyumber
    [5] => more
)

If you have trouble providing a string that contains valid XML, take your existing SimpleXMLElement and create an XML string from it:

$string = $simpleXML->asXML();

However, if you would like to get all tag names from a SimpleXMLElement object without converting it to a string, you can create a recursive iterator for SimpleXMLElement as well:

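class SimpleXMLElementIterator extends IteratorIterator implements RecursiveIterator
{
    public function __construct(SimpleXMLElement $element) {
        // SimpleXMLElement is Traversable, so IteratorIterator can wrap it directly.
        parent::__construct($element);
    }

    // Descend only if children() of the current element is non-empty.
    public function hasChildren() {
        return (bool)$this->current()->children();
    }

    // Wrap the current element's children in another recursive iterator.
    public function getChildren() {
        return new self($this->current()->children());
    }
}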

Its usage is similar:

$it      = new RecursiveIteratorIterator(
    new SimpleXMLElementIterator($xml), RecursiveIteratorIterator::SELF_FIRST
);
$tagnames = array_keys(iterator_to_array($it));

It just depends on what you need.
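
If the goal is the values below the dynamic nodes rather than just the list of names, one option is to walk the direct children and match on the name. The following is only a minimal sketch: the sample document is made up (the question did not include any XML), and the `ti_pid_` prefix and the `$found` variable are assumptions for illustration.

```php
<?php
// Hypothetical sample document; not taken from the question.
$string = '<root><IpStatus>1</IpStatus>'
        . '<ti_pid_20642><dependend>a</dependend><more>b</more></ti_pid_20642>'
        . '</root>';
$xml = new SimpleXMLElement($string);

$found = [];
foreach ($xml->children() as $name => $child) {
    // Skip elements whose name does not start with the assumed prefix.
    if (strpos($name, 'ti_pid_') !== 0) {
        continue;
    }
    // Record each child of the matched element as name => text value.
    foreach ($child->children() as $childName => $grandchild) {
        $found[$name][$childName] = (string) $grandchild;
    }
}

print_r($found);
```

Once a name is known, the element can also be read through variable property access, e.g. `$xml->{$name}`. The curly-brace form is required for names that are not valid PHP identifiers, such as `$xml->{'ti-pid_2323'}`.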

This becomes less straightforward with namespaced elements, depending on whether you want only the local names, the namespace prefixes, or even the namespace URIs along with the tag names.

By default, SimpleXML only offers traversal over elements in the default namespace. The given SimpleXMLElementIterator could be changed to support iteration over elements across namespaces:

/**
 * SimpleXMLElementIterator over all child elements across namespaces
 */
class SimpleXMLElementIterator extends IteratorIterator implements RecursiveIterator
{
    public function __construct(SimpleXMLElement $element) {
        // xpath('./*') selects child elements regardless of their namespace.
        parent::__construct(new ArrayIterator($element->xpath('./*')));
    }

    // Use the element's local name as the key instead of the array index.
    public function key() {
        return $this->current()->getName();
    }

    public function hasChildren() {
        return (bool)$this->current()->xpath('./*');
    }

    // The constructor runs the xpath query, so pass the element itself.
    public function getChildren() {
        return new self($this->current());
    }
}
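
The difference between the two traversal styles can be seen on a small input. This is a sketch under assumptions: the sample document and the `urn:example:ns1` URI are made up, and the variable names are only for illustration. Plain `children()` should skip the prefixed element, while `xpath('./*')` should return both.

```php
<?php
// Hypothetical sample: one prefixed and one unprefixed child element.
$string = '<root xmlns:ns1="urn:example:ns1">'
        . '<ns1:IpStatus>1</ns1:IpStatus><ti_pid_20642/></root>';
$xml = new SimpleXMLElement($string);

// Default SimpleXML traversal: elements with a namespace prefix are not visited.
$viaChildren = [];
foreach ($xml->children() as $child) {
    $viaChildren[] = $child->getName();
}

// XPath traversal: "*" matches child elements in any namespace.
$viaXpath = [];
foreach ($xml->xpath('./*') as $child) {
    $viaXpath[] = $child->getName();
}

print_r($viaChildren);
print_r($viaXpath);
```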

You would then need to check the namespace of each element. As an example, here is a modified XML document that makes use of namespaces:

<root xmlns="namspace:default" xmlns:ns1="namespace.numbered.1">
    <ns1:IpStatus>1</ns1:IpStatus>
    <ti_pid_20642>
        <dependend xmlns="namspace:depending">
            <ti-pid_2323>ti-pid_2323</ti-pid_2323>
            <ti_pid_anyumber>ti_pid_anyumber</ti_pid_anyumber>
            <more xmlns:ns2="namspace.numbered.2">
                <ti_pid_20642 ns2:attribute="test">ti_pid_20642</ti_pid_20642>
                <ns2:ti_pid_20642>ti_pid_20642</ns2:ti_pid_20642>
            </more>
        </dependend>
    </ti_pid_20642>
</root>

Combined with the updated SimpleXMLElementIterator above, the following example code demonstrates the new behavior:

$xml = new SimpleXMLElement($string);
$it  = new RecursiveIteratorIterator(
    new SimpleXMLElementIterator($xml), RecursiveIteratorIterator::SELF_FIRST
);

$count = 0;
foreach ($it as $name => $element) {
    // First namespace used by the element as prefix => URI.
    // each() was removed in PHP 8, so read the first entry with key()/current().
    $nsList = $element->getNamespaces();
    $ns     = (string) key($nsList);
    $nsUri  = (string) current($nsList);
    printf("#%d:  %' -20s  %' -4s  %s\n", ++$count, $name, $ns, $nsUri);
}

Output:

#1:  IpStatus              ns1   namespace.numbered.1
#2:  ti_pid_20642                namspace:default
#3:  dependend                   namspace:depending
#4:  ti-pid_2323                 namspace:depending
#5:  ti_pid_anyumber             namspace:depending
#6:  more                        namspace:depending
#7:  ti_pid_20642                namspace:depending
#8:  ti_pid_20642          ns2   namspace.numbered.2
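
As an alternative to reading the first entry of getNamespaces(), the same three pieces of information can be taken from the DOM node behind each element. This is a different technique from the one above, shown only as a sketch: it assumes the DOM extension is available, and the sample document with its `urn:example:*` URIs is made up.

```php
<?php
// Hypothetical sample with a default namespace and one prefixed namespace.
$string = '<root xmlns="urn:example:default" xmlns:ns1="urn:example:ns1">'
        . '<ns1:IpStatus>1</ns1:IpStatus><ti_pid_20642/></root>';
$xml = new SimpleXMLElement($string);

$rows = [];
foreach ($xml->xpath('./*') as $element) {
    // dom_import_simplexml() exposes the element as a DOMElement.
    $node   = dom_import_simplexml($element);
    $rows[] = [
        $node->localName,             // tag name without prefix
        (string) $node->prefix,       // empty string for unprefixed elements
        (string) $node->namespaceURI, // URI of the namespace in effect
    ];
}

print_r($rows);
```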

Have fun.