Which is the best way to parse an XML file in PHP?
First
Using the DOM object
//code
$dom = new DOMDocument();
$dom->load("xml.xml");
$tags = $dom->getElementsByTagName("tag");
foreach ($tags as $tag)
{
    // search inside the current element, not on the node list itself
    $subChild = $tag->getElementsByTagName("child");
    // extract values and loop again if needed
}
Second
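Using the simplexml_load_* functions
// code
// simplexml_load_file() takes a path; simplexml_load_string() expects the XML text itself
$xml = simplexml_load_file("xml.xml");
// $xml already is the root element, so iterate over its <tag> children directly
foreach ($xml->tag as $tag)
{
    $subChild = $tag->child;
    // extract values and loop again if needed
}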
Note:
These are the two I am aware of. If there are more, please add them.
I want to know which method is best for parsing huge XML files, and which is fastest regardless of how it has to be implemented.
File sizes will vary from 500 KB to 2 MB. The parser should handle small as well as large files in the least amount of time, with good memory usage if possible.
It depends on the document you're parsing, but XMLReader is usually faster than both SimpleXML and DOM (http://blog.liip.ch/archive/2004/05/10/processing_large_xml_documents_with_php.html). Personally, though, I've never used XMLReader, and I usually decide which to use depending on whether or not I need to edit the document:
- simplexml if I'm just reading a document
- DOM if I'm modifying the DOM and saving it back
You can also convert objects between simplexml and DOM.
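As a rough sketch of that conversion, the built-in dom_import_simplexml() and simplexml_import_dom() functions go in each direction. The sample document and the "checked" attribute below are made up for illustration; both objects wrap the same underlying node, so an edit made through one API should be visible through the other.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xmlString = '<root><tag><child>one</child></tag><tag><child>two</child></tag></root>';

// Start with SimpleXML for convenient reading.
$sx = simplexml_load_string($xmlString);

// SimpleXML -> DOM: returns the DOMElement for the same node.
$domElement = dom_import_simplexml($sx);
$domElement->setAttribute('checked', 'yes'); // edit through the DOM API

// DOM -> SimpleXML: wrap the DOM node for easy reading again.
$back = simplexml_import_dom($domElement);

echo $back['checked'], "\n";      // attribute set via DOM
echo $back->tag[1]->child, "\n";  // text of the second <child>
```

This makes it possible to read with SimpleXML and drop down to DOM only for the parts that need editing.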
I have started to use XMLReader to parse the XML files. After a bit of googling around, I found it to be the best way to parse XML files, as it does not load the whole document into memory. For example, if my XML file is 5 MB, parsing it with XMLReader does not tie up 5 MB of memory.
//usage
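$xml = new XMLReader();
$xml->XML($xmlString); // or $xml->open("xml.xml") to read straight from a file
while ($xml->read()) // read() is a method; it moves to the next node
{
    if ($xml->localName == 'Something') // check if tag name equals something
    {
        //do something
    }
}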
Using XMLReader, we can tell whether the current tag is an opening or a closing tag (via the nodeType property) and act accordingly.
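The opening/closing check can be sketched roughly as follows, comparing nodeType against the XMLReader::ELEMENT and XMLReader::END_ELEMENT constants. The sample document and the "item"/"id" names are assumptions for illustration only.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xmlString = '<root><item id="1">first</item><item id="2">second</item></root>';

$reader = new XMLReader();
$reader->XML($xmlString); // for a file on disk, $reader->open($path) can be used instead

$items = [];
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'item') {
        // Opening tag: attributes are available on this node.
        $id = $reader->getAttribute('id');
        // readString() returns the text content of the current element.
        $items[$id] = $reader->readString();
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT && $reader->localName === 'root') {
        // Closing tag of the document element: nothing left to read.
        break;
    }
}
$reader->close();

print_r($items);
```

Without the nodeType check, a test on localName alone matches both the opening and the closing tag of an element.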
If you're processing huge files, don't parse them yourself; apply XSLT instead. That'll save you huge amounts of memory and processing time.
I prefer simplexml_load_string for ease of use. Processing speed may well depend on the format of the XML file if the two use different methods of parsing the file - try it out on your own files and see which is better for you.
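One way to "try it out" is a small timing harness like the sketch below. The generated document and the timeIt() helper are hypothetical; swap in your own files, and treat the numbers as indicative only, since they depend on the machine and the shape of the XML.

```php
<?php
// Build a hypothetical test document; replace this with your own files.
$xmlString = '<root>' . str_repeat('<tag><child>value</child></tag>', 10000) . '</root>';

// Hypothetical helper: run a callable and return [result, elapsed seconds].
function timeIt(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return [$result, microtime(true) - $start];
}

// DOM: build the full tree, then count <child> elements.
[$domCount, $domTime] = timeIt(function () use ($xmlString) {
    $dom = new DOMDocument();
    $dom->loadXML($xmlString);
    return $dom->getElementsByTagName('child')->length;
});

// SimpleXML: build the full tree, then count <tag> children of the root.
[$sxCount, $sxTime] = timeIt(function () use ($xmlString) {
    $xml = simplexml_load_string($xmlString);
    return count($xml->tag);
});

// XMLReader: walk the document node by node and count opening <child> tags.
[$readerCount, $readerTime] = timeIt(function () use ($xmlString) {
    $reader = new XMLReader();
    $reader->XML($xmlString);
    $n = 0;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'child') {
            $n++;
        }
    }
    $reader->close();
    return $n;
});

printf("DOM:       %d nodes in %.4fs\n", $domCount, $domTime);
printf("SimpleXML: %d nodes in %.4fs\n", $sxCount, $sxTime);
printf("XMLReader: %d nodes in %.4fs\n", $readerCount, $readerTime);
```

Running it several times and against files of different sizes gives a fairer picture than a single run.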
I now handle all XML with SimpleXML when I develop in PHP. It's easily extended, and its methods can be overridden when needed.
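A minimal sketch of that kind of extension: simplexml_load_string() accepts a class name as its second argument, so a subclass of SimpleXMLElement can carry its own helper methods. The CatalogElement class, its childValues() method, and the sample document are invented here for illustration.

```php
<?php
// Hypothetical subclass and helper method, for illustration only.
class CatalogElement extends SimpleXMLElement
{
    // Return the text of the <child> under each <tag> as a plain array.
    public function childValues(): array
    {
        $values = [];
        foreach ($this->tag as $tag) {
            $values[] = (string) $tag->child;
        }
        return $values;
    }
}

// Hypothetical sample document.
$xmlString = '<root><tag><child>one</child></tag><tag><child>two</child></tag></root>';

// The second argument tells SimpleXML which class to instantiate.
$xml = simplexml_load_string($xmlString, CatalogElement::class);

print_r($xml->childValues());
```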