InvalidArgumentException - The current node list is empty

Posted 2019-09-19 06:53

Question:

I am using the Goutte scraper to scrape data, and I am getting this error: InvalidArgumentException - The current node list is empty. Below is the code I am using:

$string = $crawler->filter('div#links.results')->html();

if (empty($string)) {
    return false;
}

$dom = new \DOMDocument;
$state = libxml_use_internal_errors(true);
$dom->loadHTML($string);
libxml_use_internal_errors($state);

$xp = new \DOMXPath($dom);
$divNodeList = $xp->query('//div[contains(@class, "results_links_deep")]
                                [contains(@class, "web-result")]
                           /div[contains(@class, "links_main")]
                               [contains(@class, "links_deep")]
                               [contains(@class, "result__body")]');

$results = [];

if (count($divNodeList) > 0) {
    foreach ($divNodeList as $divNode) {
        $results[] = [
            trim($xp->evaluate('string(./h2/a[@class="result__a"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__snippet"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__url"])', $divNode))
        ];
    }
}

I tried the following fix:

if ($crawler->filter('div#links.results')->count() > 0 ) {
    $string = $crawler->filter('div#links.results')->html();
}

Then it started giving a different error: DOMDocument::loadHTML(): Empty string supplied as input

Any suggestions, please?

Your filter did not return any results, which is why it crashed: html() throws when the node list is empty. I solved this issue by wrapping the call in a try/catch:

try {
    $string = $crawler->filter('div#links.results')->html();
} catch (\InvalidArgumentException $e) {
    // The filter matched no nodes, so the current node list is empty; handle that case here.
}
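
Whichever guard you use (the count() check from the question or the try/catch), the parsing step still has to cope with there being no HTML at all, otherwise loadHTML() complains about the empty input. Here is a minimal sketch of one way to do that using only PHP's built-in DOM classes. The function name parseResults and the sample markup are made up for illustration; the XPath expressions are the ones from the question.

```php
<?php
// Hypothetical helper (not part of Goutte): runs the question's XPath over a
// fragment and returns [] instead of passing an empty string to loadHTML().
function parseResults(?string $html): array
{
    // Guard: a missing or blank fragment is what triggers
    // "DOMDocument::loadHTML(): Empty string supplied as input".
    if ($html === null || trim($html) === '') {
        return [];
    }

    $dom = new \DOMDocument;
    $state = libxml_use_internal_errors(true); // silence warnings from sloppy HTML
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($state);

    $xp = new \DOMXPath($dom);
    $divNodeList = $xp->query(
        '//div[contains(@class, "results_links_deep")][contains(@class, "web-result")]'
        . '/div[contains(@class, "links_main")][contains(@class, "links_deep")]'
        . '[contains(@class, "result__body")]'
    );

    $results = [];
    foreach ($divNodeList as $divNode) {
        $results[] = [
            trim($xp->evaluate('string(./h2/a[@class="result__a"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__snippet"])', $divNode)),
            trim($xp->evaluate('string(.//a[@class="result__url"])', $divNode)),
        ];
    }

    return $results;
}

// Made-up sample markup, shaped to match the XPath above (illustration only).
$sample = '<div class="results_links_deep web-result">'
    . '<div class="links_main links_deep result__body">'
    . '<h2><a class="result__a" href="#">Title</a></h2>'
    . '<a class="result__snippet" href="#">Snippet</a>'
    . '<a class="result__url" href="#">example.com</a>'
    . '</div></div>';

$empty = parseResults(null);     // no match upstream: empty array, no warning
$found = parseResults($sample);  // one row: title, snippet, url
```

With Goutte, the caller would then read html() only when the filter matched something, for example: $nodes = $crawler->filter('div#links.results'); $results = parseResults($nodes->count() > 0 ? $nodes->html() : null);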