I am parsing through an XML
document and getting the values of nested tags using asXML()
. This works fine, but I would like to move this data into a MySQL
database whose columns match the tags of the file. So essentially how do I get the tags that asXML()
is pulling text from?
This way I can eventually do something like: INSERT INTO db.table (TheXMLTag) VALUES ('XMLTagText');
This is my code as of now:
$xml = simplexml_load_file($target_file) or die ("Error: Cannot create object");
foreach ($xml->Message->SettlementReport->SettlementData as $main ){
$value = $main->asXML();
echo '<pre>'; echo $value; echo '</pre>';
}
foreach ($xml->Message->SettlementReport->Order as $main ){
$value = $main->asXML();
echo '<pre>'; echo $value; echo '</pre>';
}
This is what my file looks like to give you an idea (So essentially how do I get the tags within [SettlementData], [0], [Fulfillment], [Item], etc. ?):
I would like to move this data into a MySQL database whose columns match the tags of the file.
Your problem is two folded.
The first part of the problem is to do the introspection on the database structure. That is, obtain all table names and obtain the column names of these. Most modern databases offer this functionality, so does MySQL. In MySQL those are the INFORMATION_SCHEMA Tables. You can query them as if those were normal database tables. I generally recommend PDO for that in PHP, mysqli is naturally doing the job perfectly as well.
The second part is parsing the XML data and mapping it's data onto the database tables (you use SimpleXMLElement for that in your question so I related to it specifically). For that you first of all need to find out how you would like to map the data from the XML onto the database. An XML file does not have a 2D structure like a relational database table, but it has a tree structure.
For example (if I read your question right) you identify Message->SettlementReport->SettlementData
as the first "table". For that specific example it is easy as the <SettlementData>
only has child-elements that could represent a column name (the element name) and value (the text-content). For that it is easy:
header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $name => $value ) {
echo $name, ': ', $value, "\n";
}
As you can see, specifying the key assignment in the foreach
clause will give you the element name with SimpleXMLElement. Alternatively, the SimpleXMLElement::getName()
method does the same (just an example which does the same just with slightly different code):
header('Content-Type: text/plain; charset=utf-8');
$table = $xml->Message->SettlementReport->SettlementData;
foreach ($table as $value) {
$name = $value->getName();
echo $name, ': ', $value, "\n";
}
In this case you benefit from the fact that the Iterator provided in the foreach
of the SimpleXMLElement
you access via $xml->...->SettlementData
traverses all child-elements.
A more generic concept would be Xpath here. So bear with me presenting you a third example which - again - does a similar output:
header('Content-Type: text/plain; charset=utf-8');
$rows = $xml->xpath('/*/Message/SettlementReport/SettlementData');
foreach ($rows as $row) {
foreach ($row as $column) {
$name = $column->getName();
$value = (string) $column;
echo $name, ': ', $value, "\n";
}
}
However, as mentioned earlier, mapping a tree-structure (N-Depth) onto a 2D-structure (a database table) might now always be that straight forward.
If you're looking what could be an outcome (there will most often be data-loss or data-duplication) a more complex PHP example is given in a previous Q&A:
- How excel reads XML file?
- PHP XML to dynamic table
Please note: As the matter of fact such mappings on it's own can be complex, the questions and answers inherit from that complexity. This first of all means those might not be easy to read but also - perhaps more prominently - might just not apply to your question. Those are merely to broaden your view and provide and some examples for certain scenarios.
I hope this is helpful, please provide any feedback in form of comments below. Your problem might or might not be less problematic, so this hopefully helps you to decide how/where to go on.
I tried with SimpleXML
but it skips text data. However, using the Document Object Model
extension works.
This returns an array where each element is an array with 2 keys: tag
and text
, returned in the order in which the tree is walked.
<?php
// recursive, pass by reference (spare memory ? meh...)
// can skip non tag elements (removes lots of empty elements)
function tagData(&$node, $skipNonTag=false) {
// get function name, allows to rename function without too much work
$self = __FUNCTION__;
// init
$out = array();
$innerXML = '';
// get document
$doc = $node->nodeName == '#document'
? $node
: $node->ownerDocument;
// current tag
// we use a reference to innerXML to fill it later to keep the tree order
// without ref, this would go after the loop, children would appear first
// not really important but we never know
if(!(mb_substr($node->nodeName,0,1) == '#' && $skipNonTag)) {
$out[] = array(
'tag' => $node->nodeName,
'text' => &$innerXML,
);
}
// build current innerXML and process children
// check for children
if($node->hasChildNodes()) {
// process children
foreach($node->childNodes as $child) {
// build current innerXML
$innerXML .= $doc->saveXML($child);
// repeat process with children
$out = array_merge($out, $self($child, $skipNonTag));
}
}
// return current + children
return $out;
}
$xml = new DOMDocument();
$xml->load($target_file) or die ("Error: Cannot load xml");
$tags = tagData($xml, true);
//print_r($tags);
?>