Using domDocument, and parsing info, I would like

Posted 2019-01-27 01:32

Question:

Possible Duplicate:
Regular expression for grabbing the href attribute of an A element

This displays the text between the <a> tags, but I would like a way to get the contents of the href attribute as well.

Is there a way to do that using DOMDocument?

$html = file_get_contents($uri);
$html = utf8_decode($html);

/*** a new dom object ***/
$dom = new DOMDocument;

/*** discard white space (set before loading; the property is only read at parse time) ***/
$dom->preserveWhiteSpace = false;

/*** load the html into the object, suppressing parser warnings ***/
@$dom->loadHTML($html);

/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');

/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');

/*** loop over the table rows ***/
foreach ($rows as $row)
{
    $a = $row->getElementsByTagName('a');
    /*** echo the values ***/
    echo $a->item(0)->nodeValue.'<br />';
    echo '<hr />';
}

Answer 1:

You're mere inches away from the answer -- you've already extracted the <a> tags inside your foreach loop. You're grabbing all of them in a DOMNodeList, so each item in that list will be an instance of DOMElement, which has a method called getAttribute.

$a->item(0)->getAttribute('href') will contain the string value of the href attribute. Tada!


A row may contain no <a> tag at all, in which case the node list is empty and item(0) returns null. You can guard against this by checking that the first item in the list is an element.

$href = null;
$first_anchor_tag = $a->item(0);
if($first_anchor_tag instanceof DOMElement)
    $href = $first_anchor_tag->getAttribute('href');
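Putting the pieces together, here is a minimal sketch of the loop from the question with the guard applied. The inline markup and the $links array are made up for illustration (they stand in for the page fetched with file_get_contents), and libxml_use_internal_errors is used in place of the @ operator as one possible way to silence parser warnings.

```php
<?php
// Hypothetical sample markup standing in for the page fetched in the question.
$html = <<<HTML
<table>
  <tr><td><a href="/first.html">First</a></td></tr>
  <tr><td>No link in this row</td></tr>
  <tr><td><a href="/third.html">Third</a></td></tr>
</table>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$links = [];   // illustrative container, not part of the original code
$rows = $dom->getElementsByTagName('table')->item(0)->getElementsByTagName('tr');

foreach ($rows as $row) {
    // item(0) is null when the row has no <a> tag, so check before using it
    $first_anchor_tag = $row->getElementsByTagName('a')->item(0);
    if ($first_anchor_tag instanceof DOMElement) {
        $links[] = [
            'text' => $first_anchor_tag->nodeValue,
            'href' => $first_anchor_tag->getAttribute('href'),
        ];
    }
}

foreach ($links as $link) {
    echo $link['text'], ' => ', $link['href'], "\n";
}
```

The second row is skipped rather than raising an error, which is the case the instanceof check is there for.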


Answer 2:

echo $a->item(0)->getAttributeNode('href')->nodeValue."<br />";
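For comparison, a small sketch of how getAttributeNode relates to getAttribute, and what each gives back when the attribute is missing. The markup and variable names are invented for this example. The key assumption to be aware of is that getAttributeNode returns a falsy value rather than a node when there is no such attribute, so chaining ->nodeValue onto it is only safe when the href is known to exist.

```php
<?php
$dom = new DOMDocument;
libxml_use_internal_errors(true);   // keep any parser warnings out of the output
$dom->loadHTML('<p><a href="/page.html">With href</a> <a name="top">Without href</a></p>');
libxml_clear_errors();

$anchors = $dom->getElementsByTagName('a');
$with    = $anchors->item(0);   // anchor that has an href
$without = $anchors->item(1);   // anchor that has none

// Both routes yield the same string when the attribute exists.
$viaNode = $with->getAttributeNode('href')->nodeValue;
$viaGet  = $with->getAttribute('href');

// When it is missing: getAttribute gives an empty string,
// hasAttribute reports false, and getAttributeNode is falsy.
$missingValue = $without->getAttribute('href');
$missingHas   = $without->hasAttribute('href');
$missingNode  = $without->getAttributeNode('href');

echo $viaNode, "\n";
echo $missingHas ? 'has href' : 'no href', "\n";
```

If only the string value is needed, getAttribute is the shorter call; hasAttribute is useful when an empty href has to be told apart from a missing one.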