PHP code to extract all text links, not image links

Posted 2019-07-31 13:57

Question:

I want to extract all text links from a webpage using the simplehtmldom class, but I don't want image links.

<?php
foreach ($html->find('a[href]') as $element)
    echo $element->href . '<br>';
?>

The code above prints every anchor link that has an href attribute.

<a href="/contact">contact</a>
<a href="/about">about</a>
<a href="/home"><img src="logo.png" /></a>

I want only /contact and /about, not /home, because it contains an image instead of text.

Answer 1:

<?php

foreach($html->find('a[href]') as $element)
{
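    // Skip anchors with no visible text (e.g. image-only links).
    // Compare against '' rather than using empty(), which also treats "0" as empty.
    if (trim($element->plaintext) === '')
        continue;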

    echo $element->href . '<br>';
}


Answer 2:

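<?php
foreach ($html->find('a[href]') as $element) {
    // Inspect the anchor's inner HTML; the href value itself never contains an <img> tag.
    if (!preg_match('%<img%i', $element->innertext)) {
        echo $element->href . '<br>';
    }
}
?>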


Answer 3:

This can be done with a CSS selector, which phpQuery supports:

$html->find('a:not(:has(img))')

This selector is unlikely to ever be supported by simple_html_dom, though.
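
A comparable filter can be sketched without phpQuery by switching to PHP's bundled DOM extension and an XPath expression. This is a swapped-in technique, not something simple_html_dom offers, and it assumes the DOM extension is enabled. The helper name extractTextLinks and the sample markup are hypothetical, chosen only for illustration.

```php
<?php
// Sketch only: uses DOMDocument/DOMXPath instead of simple_html_dom.
// extractTextLinks() is a hypothetical helper name, not a library function.
function extractTextLinks(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely well-formed; collect libxml warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];

    // Anchors that have an href attribute and no <img> anywhere inside them.
    foreach ($xpath->query('//a[@href][not(.//img)]') as $a) {
        $links[] = $a->getAttribute('href');
    }

    return $links;
}

// Sample markup modelled on the question (assumed, with href spelled correctly).
$sample = '<a href="/contact">contact</a>'
        . '<a href="/about">about</a>'
        . '<a href="/home"><img src="logo.png" /></a>';

foreach (extractTextLinks($sample) as $href) {
    echo $href, "\n";
}
```

For the sample markup this should list /contact and /about and skip /home. Note that the XPath test drops any anchor containing an image, even one that also has text; if such links should be kept, a text-based check like the one in Answer 1 may fit better.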