I have an HTML page containing a lot of meta tags, and I want to parse them to find certain ones. Here is the code I am using, but it's not picking up any of the tags.
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadHtml($contents);
$metaChildren = $dom->getElementsByTagName('meta');
var_dump($metaChildren);
Here is a snippet of the HTML I am using (I replaced the angle brackets with square brackets):
[meta name="GZPlatform" content=" pc"]
[meta name="GZFeatured" content=" Gone Gold"]
[meta name="GZHeadline" content=" pc"]
[meta name="GZP_ID" content=" pc 21153"]
Any ideas?
Are you sure the tags aren't being matched? What is the output of var_dump? What value do you get when you use var_dump($metaChildren->length)? Your code seems to work here:
<?php
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
// test.html contains the meta tags from the question
$dom->loadHTMLFile('test.html');
$metaChildren = $dom->getElementsByTagName('meta');
// Print each meta tag as name=content
for ($i = 0; $i < $metaChildren->length; $i++) {
    $el = $metaChildren->item($i);
    print $el->getAttribute('name') . '=' . $el->getAttribute('content') . "\n";
}
?>
Gives output:
GZPlatform= pc
GZFeatured= Gone Gold
GZHeadline= pc
GZP_ID= pc 21153
My guess would be that the HTML is not valid and that the $dom->loadHtml call is failing. I believe that call returns true on success and false on failure, so maybe something like this:
if ($dom->loadHtml($contents)) {
    $metaChildren = $dom->getElementsByTagName('meta');
} else {
    // handle the load failure here
}
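If the load really is the problem, it may help to see what the parser complained about. Here is a rough sketch using libxml's internal error buffer; the $contents string is made-up sample input standing in for your page, and the variable names $loaded, $errors and $metaCount are just for illustration:

```php
<?php
// Hypothetical sample input; replace with the real page contents.
$contents = '<html><head><meta name="GZPlatform" content=" pc"></head>'
          . '<body><p>unclosed paragraph</body></html>';

// Collect parser messages in a buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($contents);

// Copy the collected messages, then empty the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

if ($loaded === false) {
    echo "loadHTML failed\n";
}

// Each entry is a LibXMLError object with message and line properties.
foreach ($errors as $error) {
    echo trim($error->message) . ' (line ' . $error->line . ")\n";
}

$metaCount = $dom->getElementsByTagName('meta')->length;
echo 'meta tags found: ' . $metaCount . "\n";
```

In my experience the HTML parser is fairly forgiving and often reports problems as warnings while still building a document, so checking both the return value and the collected errors tells you more than the return value alone.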
Could it be that the parser expects you to close the meta tags?
<meta name="name" />
or
<meta name="name"></meta>
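One way to check this theory is to feed loadHTML a small document whose meta tags are not closed and see whether they still come back. A minimal sketch, assuming a hand-written $html string (not your real page) and a $found array that I made up for collecting the results:

```php
<?php
// Hand-written test document with unclosed meta tags.
$html = '<html><head>'
      . '<meta name="GZPlatform" content=" pc">'
      . '<meta name="GZFeatured" content=" Gone Gold">'
      . '</head><body></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Map each meta tag's name attribute to its content attribute.
$found = [];
foreach ($dom->getElementsByTagName('meta') as $meta) {
    $found[$meta->getAttribute('name')] = $meta->getAttribute('content');
}

print_r($found);
```

If both entries show up, the missing closing slash is not what is stopping the tags from being matched, since meta is an element without a closing tag in HTML anyway.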