I am using some code to pick out all the <td> tags from an HTML page:
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('td') as $node) {
    $array_data[] = $node->nodeValue;
}
This stores the data fine in my array.
The HTML being parsed is:
<tr>
<td>DATA 1</td>
<td><a href="12345">DATA 2</a></td>
<td>DATA 3</td>
</tr>
$array_data then contains:
Array([0]=>DATA 1 [1]=>DATA 2 [2]=>DATA 3)
I would also like to get the code out of the href of the <a> tag in the cell that contains one. Desired output:
Array([0]=>DATA 1 [1]=>12345 [2]=>DATA 2 [3]=>DATA 3)
I think <a> would be called a child node. I am very new to working with the DOM, so sorry if this seems a stupid question.
I have read the SO question "Using PHP dom to get child elements".
I've used this code to pick out the href:
foreach ($dom->getElementsByTagName('td') as $node) {
    foreach ($node->getElementsByTagName('a') as $node) {
        $link = $node->getAttribute('href');
        echo '<br>';
        echo $link;
    }
    $array_data[] = $node->nodeValue;
}
Any help or pointers to other reading material would be greatly appreciated!
Thanks
You should check whether each td has an a child. Select the anchor tag using getElementsByTagName() and check that the selection has content using its length property. If the td has an anchor child, use getAttribute() to get its href attribute. See demo
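The approach described above could look roughly like the sketch below. It reuses the sample rows from the question; wrapping them in a <table> element and the variable names $td and $anchors are assumptions made for illustration, not part of the original code. Using a separate name for the inner selection also avoids overwriting the loop variable for the cell.

```php
<?php
// Sample markup from the question, wrapped in <table> (an assumption)
// so the rows form a complete table for the parser.
$html = '<table><tr>
<td>DATA 1</td>
<td><a href="12345">DATA 2</a></td>
<td>DATA 3</td>
</tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$array_data = [];
foreach ($dom->getElementsByTagName('td') as $td) {
    // Search for anchors inside this cell only.
    $anchors = $td->getElementsByTagName('a');

    // length is 0 when the cell contains no <a> element.
    if ($anchors->length > 0) {
        // Store the href first so it precedes the cell text.
        $array_data[] = $anchors->item(0)->getAttribute('href');
    }

    // nodeValue is the text content of the cell, including the link text.
    $array_data[] = $td->nodeValue;
}

print_r($array_data);
```

With the sample rows this should collect DATA 1, 12345, DATA 2 and DATA 3, in that order. If a cell could hold more than one link, looping over $anchors instead of taking item(0) would pick up every href.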