Get XML tags from asXML()

2019-07-23 18:59发布

I am parsing through an XML document and getting the values of nested tags using asXML(). This works fine, but I would like to move this data into a MySQL database whose columns match the tags of the file. So essentially how do I get the tags that asXML() is pulling text from?

This way I can eventually do something like: INSERT INTO db.table (TheXMLTag) VALUES ('XMLTagText');

This is my code as of now:

$xml = simplexml_load_file($target_file) or die ("Error: Cannot create object");

foreach ($xml->Message->SettlementReport->SettlementData as $main ){
    $value = $main->asXML();
    echo '<pre>'; echo $value; echo '</pre>';
}

foreach ($xml->Message->SettlementReport->Order as $main ){
    $value = $main->asXML();
    echo '<pre>'; echo $value; echo '</pre>';
}

This is what my file looks like to give you an idea (So essentially how do I get the tags within [SettlementData], [0], [Fulfillment], [Item], etc. ?): enter image description here

2条回答
2楼-- · 2019-07-23 19:31

I would like to move this data into a MySQL database whose columns match the tags of the file.

Your problem is two folded.

The first part of the problem is to do the introspection on the database structure. That is, obtain all table names and obtain the column names of these. Most modern databases offer this functionality, so does MySQL. In MySQL those are the INFORMATION_SCHEMA Tables. You can query them as if those were normal database tables. I generally recommend PDO for that in PHP, mysqli is naturally doing the job perfectly as well.

The second part is parsing the XML data and mapping it's data onto the database tables (you use SimpleXMLElement for that in your question so I related to it specifically). For that you first of all need to find out how you would like to map the data from the XML onto the database. An XML file does not have a 2D structure like a relational database table, but it has a tree structure.

For example (if I read your question right) you identify Message->SettlementReport->SettlementData as the first "table". For that specific example it is easy as the <SettlementData> only has child-elements that could represent a column name (the element name) and value (the text-content). For that it is easy:

header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $name => $value ) {
    echo $name, ': ', $value, "\n";
}

As you can see, specifying the key assignment in the foreach clause will give you the element name with SimpleXMLElement. Alternatively, the SimpleXMLElement::getName() method does the same (just an example which does the same just with slightly different code):

header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $value) {
    $name = $value->getName();
    echo $name, ': ', $value, "\n";
}

In this case you benefit from the fact that the Iterator provided in the foreach of the SimpleXMLElement you access via $xml->...->SettlementData traverses all child-elements.

A more generic concept would be Xpath here. So bear with me presenting you a third example which - again - does a similar output:

header('Content-Type: text/plain; charset=utf-8');
$rows = $xml->xpath('/*/Message/SettlementReport/SettlementData');
foreach ($rows as $row) {
    foreach ($row as $column) {
        $name  = $column->getName();
        $value = (string) $column;
        echo $name, ': ', $value, "\n";
    }
}

However, as mentioned earlier, mapping a tree-structure (N-Depth) onto a 2D-structure (a database table) might now always be that straight forward.

If you're looking what could be an outcome (there will most often be data-loss or data-duplication) a more complex PHP example is given in a previous Q&A:

Please note: As the matter of fact such mappings on it's own can be complex, the questions and answers inherit from that complexity. This first of all means those might not be easy to read but also - perhaps more prominently - might just not apply to your question. Those are merely to broaden your view and provide and some examples for certain scenarios.

I hope this is helpful, please provide any feedback in form of comments below. Your problem might or might not be less problematic, so this hopefully helps you to decide how/where to go on.

查看更多
地球回转人心会变
3楼-- · 2019-07-23 19:41

I tried with SimpleXML but it skips text data. However, using the Document Object Model extension works.

This returns an array where each element is an array with 2 keys: tag and text, returned in the order in which the tree is walked.

<?php
// recursive, pass by reference (spare memory ? meh...)
// can skip non tag elements (removes lots of empty elements)
function tagData(&$node, $skipNonTag=false) {

   // get function name, allows to rename function without too much work
   $self = __FUNCTION__;

   // init
   $out      = array();
   $innerXML = '';

   // get document
   $doc = $node->nodeName == '#document'
      ? $node
      : $node->ownerDocument;

   // current tag
   // we use a reference to innerXML to fill it later to keep the tree order
   // without ref, this would go after the loop, children would appear first
   // not really important but we never know
   if(!(mb_substr($node->nodeName,0,1) == '#' && $skipNonTag)) {
      $out[] = array(
         'tag'  => $node->nodeName,
         'text' => &$innerXML,
      );
   }

   // build current innerXML and process children
   // check for children
   if($node->hasChildNodes()) {

      // process children
      foreach($node->childNodes as $child) {

         // build current innerXML
         $innerXML .= $doc->saveXML($child);

         // repeat process with children
         $out = array_merge($out, $self($child, $skipNonTag));
      }
   }

   // return current + children
   return $out;
}

$xml = new DOMDocument();
$xml->load($target_file) or die ("Error: Cannot load xml");
$tags = tagData($xml, true);
//print_r($tags);
?>
查看更多
登录 后发表回答