I want to read whatever is inside the <q:content></q:content> tags in the following XML:
$xml = '<?xml version="1.0"?>
<q:response xmlns:q="http://api-url">
<q:impression>
<q:content>
<html>
<head>
<meta name="HandheldFriendly" content="True">
<meta name="viewport" content="width=device-width, user-scalable=no">
<meta http-equiv="cleartype" content="on">
</head>
<body style="margin:0px;padding:0px;">
<iframe scrolling="no" src="http://some-url" width="320px" height="50px" style="border:none;"></iframe>
</body>
</html>
</q:content>
<q:cpc>0.02</q:cpc>
</q:impression>
...
... some more things
...
</q:response>';
I have put the xml in the variable above and then I use SimpleXMLElement::getNamespaces as given in the section "Example #1 Get document namespaces in use" -
//code continued
$dom = new DOMDocument;
// load the XML string defined above
$dom->loadXML($xml);
var_dump($dom->getElementsByTagNameNS('http://api-url', '*') ); // shows object(DOMNodeList)#3 (0) { }
foreach ($dom->getElementsByTagNameNS('http://api-url', '*') as $element)
{
//this does not execute
echo 'see - local name: ', $element->localName, ', prefix: ', $element->prefix, "\n";
}
But the code inside the foreach loop does not execute.
I have read these questions -
Update
I also tried the solution from "Parse XML with Namespace using SimpleXML":
$xml = new SimpleXMLElement($xml);
$xml->registerXPathNamespace('e', 'http://api-url');
foreach($xml->xpath('//e:q') as $event) {
echo "not coming here";
$event->registerXPathNamespace('e', 'http://api-url');
var_export($event->xpath('//e:content'));
}
In this case too, the code inside the foreach does not execute. I am not sure whether I wrote everything correctly ...
Further Update
Going with the first solution ... with error_reporting = -1, I found that the problem is with the URL in the src attribute of the iframe tag. I am getting warnings like -
Warning: DOMDocument::loadXML(): EntityRef: expecting ';' in Entity, line: 13
Updated code -
$xml = '<?xml version="1.0"?>
<q:response xmlns:q="http://api-url">
<q:impression>
<q:content>
<html>
<head>
<meta name="HandheldFriendly" content="True" />
<meta name="viewport" content="width=device-width, user-scalable=no" />
<meta http-equiv="cleartype" content="on" />
</head>
<body style="margin:0px;padding:0px;">
<iframe scrolling="no" src="http://serve.qriously.com/v1/request?type=SERVE&aid=ratingtest&at=2&uid=0000000000000000&noHash=true&testmode=true&ua=Mozilla/5.0 (Linux; U; Android 2.2.1; en-us; Nexus One Build/FRG83) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1&appid=12e2561f048158249e30000012e256826ad&pv=2&rf=2&src=admarvel&type=get&lang=eng" width="320px" height="50px" style="border:none;"></iframe>
</body>
</html>
</q:content>
<q:cpc>0.02</q:cpc>
</q:impression>
<q:app_stats>
<q:total><q:ctr>0.023809523809523808</q:ctr><q:ecpm>0.5952380952380952</q:ecpm></q:total>
<q:today><q:ctr>0.043478260869565216</q:ctr><q:ecpm>1.0869565217391306</q:ecpm></q:today>
</q:app_stats>
</q:response>';
I have no problem getting it to work. The only error I could find is that you're loading XML that contains a non-XML HTML chunk, which breaks the document: the meta elements in the head section are not closed.
See Demo.
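As an illustration (a reconstruction under assumptions, not the linked demo), here is a minimal sketch of the question's DOMDocument approach once the embedded HTML is well-formed. The XML is a shortened variant of the one in the question, with the meta element self-closed and the placeholder iframe URL from the first listing.

```php
<?php
// Shortened, well-formed variant of the XML from the question (assumption:
// the meta element is self-closed and the iframe URL contains no raw "&").
$xml = '<?xml version="1.0"?>
<q:response xmlns:q="http://api-url">
<q:impression>
<q:content>
<html>
<head>
<meta name="HandheldFriendly" content="True" />
</head>
<body style="margin:0px;padding:0px;">
<iframe scrolling="no" src="http://some-url" width="320px" height="50px" style="border:none;"></iframe>
</body>
</html>
</q:content>
<q:cpc>0.02</q:cpc>
</q:impression>
</q:response>';

$dom = new DOMDocument;
$dom->loadXML($xml);

// Collect the local names of all elements in the q namespace.
$names = array();
foreach ($dom->getElementsByTagNameNS('http://api-url', '*') as $element) {
    $names[] = $element->localName;
}
echo implode(', ', $names), "\n";

// Serialize the child nodes of <q:content> back to markup.
$content = $dom->getElementsByTagNameNS('http://api-url', 'content')->item(0);
$inner = '';
foreach ($content->childNodes as $child) {
    $inner .= $dom->saveXML($child);
}
echo trim($inner), "\n";
```

Only the q-prefixed elements are matched by the namespace lookup; the html, head and body elements carry no namespace here, so they are reached through the child nodes of q:content instead.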
Tip: Always activate error logging and reporting, and check for warnings and notices while you develop and debug code. Setting error_reporting to -1 and enabling display_errors shows all kinds of PHP error messages, including warnings, notices and strict standards.
DOMDocument is then quite talkative about malformed elements when loading XML.
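A minimal sketch of how full error reporting might be switched on at the top of a script during development. The value -1 matches the setting mentioned in the question; displaying errors directly is assumed to be acceptable only on a development machine.

```php
<?php
// Report every kind of PHP message (errors, warnings, notices, strict).
error_reporting(-1);
// Print the messages instead of only logging them (development only).
ini_set('display_errors', '1');
```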
Fixing the XML "on the fly"
DOMDocument::loadXML() accepts only well-formed XML. If you've got HTML, you can alternatively check whether DOMDocument::loadHTML() does the job as well; however, it will convert the loaded string into an X(HT)ML document, which is probably not what you're looking for.
To make a specific part of the string XML compatible before loading it, you can search for string patterns to obtain the substring that represents the HTML inside the XML and properly XML-encode it.
E.g. you can look for <html> and </html> as the surrounding tags, extract that substring from the whole string and replace it with substr_replace(). To encode the HTML for use as data inside the XML, use the htmlspecialchars() function; with the ENT_QUOTES flag it replaces all five special characters (&, <, >, " and ') with the entities described in the other SO answer.
Some mock-up code:
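The following is only a sketch of the search, encode and replace steps described above, not the answer's original code. The helper name escapeHtmlChunk() is hypothetical, and it assumes exactly one <html>...</html> chunk with a bare <html> opening tag. The sample XML is a shortened variant of the one in the question, with an invented query string to show the raw "&" case.

```php
<?php
/**
 * Hypothetical helper: XML-encode the chunk between <html> and </html>
 * so the surrounding document becomes well-formed XML.
 */
function escapeHtmlChunk($xml)
{
    $start = strpos($xml, '<html>');
    $end   = strpos($xml, '</html>');
    if ($start === false || $end === false || $end < $start) {
        return $xml; // no HTML chunk found, leave the string untouched
    }
    $length = $end + strlen('</html>') - $start;
    $html   = substr($xml, $start, $length);

    // htmlspecialchars() with ENT_QUOTES encodes &, <, >, " and '.
    $encoded = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

    return substr_replace($xml, $encoded, $start, $length);
}

// Shortened sample; the query string is made up for illustration.
$xml = '<?xml version="1.0"?>
<q:response xmlns:q="http://api-url">
<q:content>
<html>
<head>
<meta name="HandheldFriendly" content="True">
</head>
<body><iframe src="http://some-url?a=1&b=2"></iframe></body>
</html>
</q:content>
</q:response>';

$dom = new DOMDocument;
$dom->loadXML(escapeHtmlChunk($xml));

// The text content of <q:content> is now the original HTML string.
$content = $dom->getElementsByTagNameNS('http://api-url', 'content')->item(0);
echo trim($content->textContent), "\n";
```

With this approach the HTML travels as plain character data, so unclosed meta elements and raw ampersands in the iframe URL no longer matter to the XML parser. The trade-off is that the HTML is no longer available as DOM nodes and has to be parsed separately if needed.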