Get DIV content from an external website

Posted 2019-01-07 08:53

Question:

I want to get a DIV from an external website with pure PHP.

External website: http://www.isitdownrightnow.com/youtube.com.html

Div text I want from isitdownrightnow (statusup div): <div class="statusup">The website is probably down just for you...</div>

I already tried file_get_contents with DOMDocument and str_get_html, but I could not get it to work.

For example, this:

$page = file_get_contents('http://css-tricks.com/forums/topic/jquery-selector-div-variable/');
$doc = new DOMDocument();
$doc->loadHTML($page);
$divs = $doc->getElementsByTagName('div');
foreach ($divs as $div) {
    // Loop through the DIVs looking for one with a class of "bbp-template-notice"
    // Then echo out its contents (pardon the pun)
    if ($div->getAttribute('class') === 'bbp-template-notice') {
        echo $div->nodeValue;
    }
}

It will just display an error in the console:

Failed to load resource: the server responded with a status of 500 (Internal Server Error)

Answer 1:

This is what I always use:

$url = 'https://somedomain.com/somesite/';
$content = file_get_contents($url);
$first_step = explode( '<div id="thediv">' , $content );
$second_step = explode("</div>" , $first_step[1] );

echo $second_step[0];
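
One caveat with the snippet above: if the opening tag is not found, $first_step[1] does not exist and PHP raises a warning. Here is a minimal sketch of the same idea with that case handled. The helper name extractBetween and the inline sample markup are my own assumptions for illustration; in practice $content would come from file_get_contents($url).

```php
<?php
// Hypothetical helper: returns the text between $open and the next $close,
// or null when either marker is missing.
function extractBetween(string $html, string $open, string $close = '</div>'): ?string
{
    $start = strpos($html, $open);
    if ($start === false) {
        return null; // opening marker not present
    }
    $start += strlen($open);

    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null; // no closing marker after the opening one
    }

    return substr($html, $start, $end - $start);
}

// Inline sample markup stands in for file_get_contents($url).
$content = '<html><body><div id="thediv">Hello world</div></body></html>';
$result = extractBetween($content, '<div id="thediv">');

echo $result ?? 'marker not found';
```

Like the explode version, this stops at the first closing </div>, so it only suits divs that contain no nested divs.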


Answer 2:

This may be a little overkill, but you'll get the gist.

<?php 

$doc = new DOMDocument;

// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;

// Most HTML Developers are chimps and produce invalid markup...
$doc->strictErrorChecking = false;
$doc->recover = true;

$doc->loadHTMLFile('http://www.isitdownrightnow.com/check.php?domain=youtube.com');

$xpath = new DOMXPath($doc);

$query = "//div[@class='statusup']";

$entries = $xpath->query($query);
var_dump($entries->item(0)->textContent);

?>
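
A variation on the same approach, as a sketch only: it parses a string instead of a URL, collects libxml's parse warnings rather than printing them, and matches the class even when the div carries several class names. The sample markup (including the extra "big" class and the unclosed <p>) is made up for illustration; with a real page, $html would be the fetched source.

```php
<?php
// Sample markup stands in for the remote page; the unclosed <p> is deliberate.
$html = '<html><body><p>Status<div class="statusup big">'
      . 'The website is probably down just for you...</div></body></html>';

// Collect libxml parse warnings internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);

// Match "statusup" as one of possibly several space-separated class names.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' statusup ')]";
$entries = $xpath->query($query);

// Guard against an empty result before calling item(0).
$status = $entries->length > 0 ? trim($entries->item(0)->textContent) : null;

echo $status ?? 'div not found';
```

The length check matters because item(0) returns null when nothing matches, and reading textContent on null fails.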


Answer 3:

I used the XPath method proposed by @mightyuhu and it worked great with his addition of the assignment. Depending on the page you pull the data from, and on whether the tag you want has an 'id' or 'class' that identifies it, you will have to adapt the query. If the tag has an 'id', you can use this (the sample extracts the USD exchange rate):

$query = "//div[@id='USD']";

However, site developers rarely make it that easy, so there will often be several more 'unnamed' tags to dig into. In my example:

<div id="USD" class="tab">
  <table cellspacing="0" cellpadding="0">
    <tbody>
     <tr>
        <td>Ask Rate</td>
        <td align="right">1.77400</td>
     </tr>
     <tr class="even">
        <td>Bid Rate</td>
        <td align="right">1.70370</td>
     </tr>
     <tr>
        <td>BNB Fixing</td>
        <td align="right">1.735740</td>
     </tr>
   </tbody>
  </table>
</div>

So I had to change the query to get the 'Ask Rate':

$doc->loadHTMLFile('http://www.fibank.bg/en');
$xpath = new DOMXPath($doc);
$query = "//div[@id='USD']/table/tbody/tr/td";

I used the query above but changed the item index from 0 to 1 to get the second cell, which holds the exchange rate (the first cell contains the text 'Ask Rate'):

$entries = $xpath->query($query);
$usdrate = $entries->item(1)->textContent;

Another method is to reference the value directly within the query. When the tags have no names or styles, this is done by indexing them. I learned this from my Maxthon browser: its "Inspect element" feature, combined with the "Copy XPath" option in the right-click menu, produces the path for you (neat, yeah?):

"//*[@id='USD']/table/tbody/tr[1]/td[2]"

Notice it also inserts an asterisk (*) after the //, which I have not dug into. In this case you should again get the value with item(0), since the query matches only one node.
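
To compare the two queries side by side, here is a small sketch that runs both against the sample markup from this answer, loaded from a string instead of the live site (so the variable names and the inline HTML are only for illustration). On the asterisk: in XPath, * matches an element of any name, so //*[@id='USD'] selects whichever element carries that id, not only a div.

```php
<?php
// The sample table from above, used in place of the live page.
$html = <<<'HTML'
<div id="USD" class="tab">
  <table cellspacing="0" cellpadding="0">
    <tbody>
     <tr><td>Ask Rate</td><td align="right">1.77400</td></tr>
     <tr class="even"><td>Bid Rate</td><td align="right">1.70370</td></tr>
     <tr><td>BNB Fixing</td><td align="right">1.735740</td></tr>
   </tbody>
  </table>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Query 1: every cell in the table; the Ask Rate value is the second cell.
$all = $xpath->query("//div[@id='USD']/table/tbody/tr/td");
$viaItem = $all->item(1)->textContent;

// Query 2: first row, second cell, addressed directly by position.
$direct = $xpath->query("//*[@id='USD']/table/tbody/tr[1]/td[2]");
$viaIndex = $direct->item(0)->textContent;

echo $viaItem, "\n", $viaIndex, "\n";
```

Both lines print the same value. The indexed form is less fragile if the site adds or removes rows below the first one, but it still breaks if the row order changes.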

If needed, you can post-process the string you extracted, for example changing the number format to match your preference:

$usdrate = number_format($usdrate, 5, ',', ' ');
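
As a quick illustration of what that call does, here is a sketch with a made-up scraped value ($raw is hypothetical). The string is trimmed and cast to float first, since text taken from a page often carries stray whitespace.

```php
<?php
// Hypothetical scraped value with surrounding whitespace.
$raw = " 1.77400 ";

// Convert to a number before formatting.
$usdrate = (float) trim($raw);

// 5 decimals, comma as decimal separator, space as thousands separator.
$formatted = number_format($usdrate, 5, ',', ' ');
echo $formatted, "\n"; // 1,77400

// The thousands separator only shows up on larger values.
$large = number_format(1234.5, 2, ',', ' ');
echo $large, "\n"; // 1 234,50
```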

I hope someone finds this as helpful as I found the answers above, and that it saves them time searching for the correct query and syntax.



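Answer 4:

// $url must already hold the address of the page to fetch.
$contents = file_get_contents($url);

// Keep only the markup between the opening tag and the first closing </div>.
$title = explode('<div class="entry-content">', $contents);
$title = explode('</div>', $title[1]);

// Write the fragment to s.php, then include it so it is output on this page.
// Note that require_once runs the file as PHP, so any PHP tags in the fetched markup would execute.
$fp = fopen('s.php', 'w+');
fwrite($fp, $title[0]);
fclose($fp);
require_once('s.php');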