Simple DOM file_get_html returns nothing

Posted 2019-07-17 16:41

Question:

I'm trying to scrape data from some websites. For several sites it all seems to go fine, but for one website it doesn't seem to be able to get any HTML. This is my code:

<?php include_once('simple_html_dom.php');

$html = file_get_html('https://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=' . $_POST['data']);

echo $html; ?>

I'm using Ajax to fetch the data. When I log the returned value in my JavaScript, it's completely empty.

Could it be because this website is running on https? And if so, is there any way to work around it? (I've tried changing the URL to http, but I get the same result.)

Update:

If I var_dump the $html variable, I get bool(false).

My PHP error log says this:

[27-Feb-2014 22:20:50 Europe/Amsterdam] PHP Warning: file_get_contents(http://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=tarmogoyf): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in /Users/leondewit/PhpstormProjects/Magic/stores/simple_html_dom.php on line 75

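Answer 1:

The problem is your user agent. file_get_contents doesn't send a User-Agent header by default, and this site appears to answer such requests with 403 Forbidden. Pass one in through a stream context: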

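include_once('simple_html_dom.php');

$url = 'http://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=tarmogoyf';
// Send a User-Agent header; file_get_contents sends none by default.
$context = stream_context_create(array('http' => array('header' => 'User-Agent: Mozilla compatible')));
$response = file_get_contents($url, false, $context);
// file_get_contents returns false on failure, so check before parsing.
$html = ($response !== false) ? str_get_html($response) : false;
echo $html;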
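
To use this with the search term posted from the form rather than a hard-coded URL, the term should be URL-encoded before it goes into the query string. The following is only a minimal sketch of one way to do that: the helper names build_search_url, build_context and fetch_page are hypothetical (they are not part of simple_html_dom), and the User-Agent value is just the placeholder used above.

```php
<?php
// Hypothetical helper: build the search URL with the term safely encoded.
function build_search_url($term)
{
    $query = http_build_query(
        array('mainPage' => 'showSearchResult', 'searchFor' => $term),
        '',
        '&'
    );
    return 'https://www.magiccardmarket.eu/?' . $query;
}

// Hypothetical helper: create a stream context that sends a User-Agent header.
function build_context($userAgent = 'Mozilla compatible')
{
    return stream_context_create(array(
        'http' => array('header' => 'User-Agent: ' . $userAgent),
    ));
}

// Hypothetical helper: return the page body as a string, or false on failure.
function fetch_page($term)
{
    return file_get_contents(build_search_url($term), false, build_context());
}
```

With these in place, the original script could call fetch_page($_POST['data']) and hand the result to str_get_html only when it is not false. Whether the site accepts this particular User-Agent string is an assumption; it may require a more browser-like value.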