Parse HTML using PHP and loop through table rows and columns?

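I'm trying to parse HTML loaded with loadHTML, but I'm having trouble. I managed to loop through all the <tr> elements in the document, but I don't know how to loop through the <td> elements in each row.

This is what I have so far: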

$DOM->loadHTML($url);
$rows= $DOM->getElementsByTagName('tr');

for ($i = 0; $i < $rows->length; $i++) { // loop through rows
    // loop through columns
    ...
}

How can I loop through the columns in each row?

Answer 1:

DOMElement also supports getElementsByTagName:

$DOM = new DOMDocument();
$DOM->loadHTMLFile("file path or url");
$rows = $DOM->getElementsByTagName("tr");
for ($i = 0; $i < $rows->length; $i++) {
    $cols = $rows->item($i)->getElementsByTagName("td");
    for ($j = 0; $j < $cols->length; $j++) {
        echo $cols->item($j)->nodeValue, "\t";
        // you can also use DOMElement::textContent
        // echo $cols->item($j)->textContent, "\t";
    }
    echo "\n";
}
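
For a self-contained variant of the loop above, here is a minimal sketch that parses an inline HTML string and collects the cell text into a nested array with foreach. The sample markup and the $table variable are assumptions made up for illustration, not part of the question.

```php
<?php
// Sample markup (made up for this sketch).
$html = '<table>
  <tr><td>a1</td><td>a2</td></tr>
  <tr><td>b1</td><td>b2</td></tr>
</table>';

$DOM = new DOMDocument();
// loadHTML() expects a string of markup; loadHTMLFile() expects a path or URL.
$DOM->loadHTML($html);

$table = [];
foreach ($DOM->getElementsByTagName('tr') as $row) {
    $cells = [];
    // Calling getElementsByTagName() on the row limits the search to that row.
    foreach ($row->getElementsByTagName('td') as $col) {
        $cells[] = trim($col->textContent);
    }
    $table[] = $cells;
}

print_r($table);
```

DOMNodeList can be iterated directly with foreach, so the index-based item() calls are only needed when the position itself matters.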

Answer 2:

Would re-looping work?

$DOM->loadHTML($url);
$rows= $DOM->getElementsByTagName('tr');
$tds= $DOM->getElementsByTagName('td');

for ($i = 0; $i < $rows->length; $i++) {
    // loop through rows
    for ($j = 0; $j < $tds->length; $j++) {
        // loop through columns; use $j so the inner loop does not overwrite $i

    }

}

EDIT: You will also have to check the parent node of each <td> to make sure it is the <tr> you are currently in, something like this:

if ($td->parentNode->isSameNode($row)) { // $td: the current cell, $row: the current row
    // do whatever
}

The syntax may not be 100% correct, but the concept is sound.
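
To make the parent-node idea concrete, here is a minimal sketch, assuming a small made-up table. It fetches all <td> elements once and keeps only those whose parent is the current row; the $grouped, $row and $td names are hypothetical.

```php
<?php
// Sample markup (made up for this sketch).
$html = '<table>
  <tr><td>a1</td><td>a2</td></tr>
  <tr><td>b1</td></tr>
</table>';

$DOM = new DOMDocument();
$DOM->loadHTML($html);

$rows = $DOM->getElementsByTagName('tr');
$tds  = $DOM->getElementsByTagName('td');

$grouped = [];
for ($i = 0; $i < $rows->length; $i++) {
    $row = $rows->item($i);
    $grouped[$i] = [];
    for ($j = 0; $j < $tds->length; $j++) {
        $td = $tds->item($j);
        // Keep only the cells whose parent is the current row.
        if ($td->parentNode->isSameNode($row)) {
            $grouped[$i][] = $td->textContent;
        }
    }
}

print_r($grouped);
```

This scans every cell once per row, so on large tables it is likely cheaper to ask each row for its own cells instead.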

Answer 3:

Use DOMXPath to query the child column nodes with a relative XPath query, like this:

$xpath = new DOMXPath( $DOM);
$rows= $xpath->query('//table/tr');

foreach( $rows as $row) {
    $cols = $xpath->query( 'td', $row); // Get the <td> elements that are children of this <tr>
    foreach( $cols as $col) {
        echo $col->textContent;
    }
}

Edit: To start at a specific row and stop at another, keep your own index into the rows by changing how you iterate over the DOMNodeList:

$xpath = new DOMXPath( $DOM);
$rows= $xpath->query('//table/tr');

for( $i = 3, $max = $rows->length - 2; $i < $max; $i++) {
    $row = $rows->item( $i);
    $cols = $xpath->query( 'td', $row);
    foreach( $cols as $col) {
        echo $col->textContent;
    }
}
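
One caveat with '//table/tr': it only matches rows that are direct children of <table>. If the markup wraps its rows in <thead> or <tbody>, that query returns nothing. The sketch below, using a made-up table, shows the difference and uses '//table//tr' instead; the 'td|th' union and the $data and $direct names are assumptions added for illustration.

```php
<?php
// Sample markup (made up for this sketch) with explicit <thead>/<tbody>.
$html = '<table>
  <thead><tr><th>Name</th><th>Qty</th></tr></thead>
  <tbody>
    <tr><td>apple</td><td>3</td></tr>
    <tr><td>pear</td><td>5</td></tr>
  </tbody>
</table>';

$DOM = new DOMDocument();
$DOM->loadHTML($html);
$xpath = new DOMXPath($DOM);

// Matches nothing here: every <tr> sits inside <thead> or <tbody>.
$direct = $xpath->query('//table/tr');

// Matches rows at any depth below the table.
$rows = $xpath->query('//table//tr');

$data = [];
foreach ($rows as $row) {
    $cells = [];
    // Relative query: header and data cells that are children of this row.
    foreach ($xpath->query('td|th', $row) as $col) {
        $cells[] = $col->textContent;
    }
    $data[] = $cells;
}

print_r($data);
```

Note that '//table//tr' would also pick up rows of any nested tables, so a tighter path may be needed for more complex pages.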

Source: Parse html using PHP and loop through table rows and columns?