How do I render javascript from another site, insi

Posted 2019-02-25 20:25

Question:

I'm trying to read a specific line from a web page from inside my PHP application. This is my experimental setup so far:

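      <?php
           $url = "http://www.some-web-site.com";
           // Fetch the raw HTML; file_get_contents() returns false on failure.
           $file_contents = file_get_contents($url);
           if ($file_contents === false) {
                die("Could not fetch $url");
           }
           $findme = 'text to be found';
           $pos = strpos($file_contents, $findme);
           // Use ===: strpos() returns 0 for a match at the very start, and 0 == false.
           if ($pos === false) {
                echo "The string '$findme' was not found in the string";
           } else {
                echo "The string '$findme' was found in the string";
                echo " and exists at position $pos";
           }
      ?>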

The branches of the "if" statement only echo for now; they will become database operations later. The current setup is just to test functionality.

The problem is that with this method any JavaScript on the page is returned as script source. What I need is the text that the script would render in the browser. Is there any way to do this within PHP?

What I'm ultimately trying to achieve is updating stock from within an ecommerce site via reading the stock level from the site's supplier. The supplier does not use RSS feeds for this.

Answer 1:

cURL does not parse or execute JavaScript. If the content you are trying to read is placed in the page by JavaScript after the initial page render, it will not be accessible via cURL. The same applies to file_get_contents(), which also returns only the raw source.
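
To illustrate the point, here is a minimal sketch using a made-up page. The HTML string, the `stock` id, and the `visible_text` helper are assumptions for the example, not taken from any real supplier site. The text a browser would show exists only after the script runs, so a plain string search over the fetched source does not find it.

```php
<?php
// Hypothetical page source, as cURL or file_get_contents() would return it.
// A browser would run the script and display "In stock: 12".
$html = '<html><body><div id="stock"></div>'
      . '<script>document.getElementById("stock").textContent = "In stock: " + 12;</script>'
      . '</body></html>';

// Hypothetical helper: drop <script> blocks, then strip the remaining tags,
// leaving only the text that is present in the static HTML.
function visible_text(string $html): string
{
    $without_scripts = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);
    return trim(strip_tags($without_scripts));
}

$static_text = visible_text($html);

// The rendered text is in neither the raw source nor the static text.
var_dump(strpos($html, 'In stock: 12'));        // bool(false)
var_dump(strpos($static_text, 'In stock: 12')); // bool(false)
```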



Answer 2:

The script has to be executed by something, and its result passed back to your PHP script. PHP itself has no web browser functionality.

I suggest you learn about the "web crawler" and "web browser" components that are included in the .NET Framework (not PHP), so that you can build a small program around them and call it from PHP with the exec() function.

Look for example code for web crawlers and web browsers on codeproject.com.

Hope it works.
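
One way to follow this suggestion is to call a headless browser from PHP rather than a .NET program. The sketch below is an illustration under assumptions, not part of the original answer: it assumes a Chrome or Chromium binary is installed and reachable as `google-chrome`, and the function names `build_render_command` and `render_page` are made up for this example. The `--headless` and `--dump-dom` flags ask Chrome to load the page, run its scripts, and print the resulting DOM.

```php
<?php
// Build the shell command that asks a headless browser to print the rendered DOM.
// $browser is an assumption: adjust it to the binary name on your system.
function build_render_command(string $url, string $browser = 'google-chrome'): string
{
    // escapeshellarg() quotes the URL so it cannot inject extra shell syntax.
    return escapeshellcmd($browser) . ' --headless --dump-dom ' . escapeshellarg($url);
}

// Run the command and return the rendered HTML, or null if nothing came back.
function render_page(string $url, string $browser = 'google-chrome'): ?string
{
    $output = shell_exec(build_render_command($url, $browser));
    return (is_string($output) && $output !== '') ? $output : null;
}
```

If `render_page($url)` returns a string, the `strpos()` check from the question can be run against it, and it would then see the markup after the scripts have run instead of the script source.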



Answer 3:

You can get the entire web page as a file like this:

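    function get_data($url)
    {
        $ch = curl_init();
        $timeout = 5; // seconds to wait while connecting
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        $data = curl_exec($ch); // false on failure
        curl_close($ch);
        return $data;
    }

    $returned_content = get_data('http://example.com/page.htm');
    if ($returned_content === false) {
        die('Could not fetch the page');
    }

    // Save the raw HTML to a local file.
    $my_file = 'file.htm';
    $handle = fopen($my_file, 'w') or die('Cannot open file: '.$my_file);
    fwrite($handle, $returned_content);
    fclose($handle);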

Then I suppose you can use a class such as the one explained in the link below as a guide to separate the JavaScript from the HTML (it's usually in the head tags). For linked (imported) .js files you would have to repeat the function for those URLs, and likewise for linked or imported CSS. You can also grab images if you need to save them as files. http://www.digeratimarketing.co.uk/2008/12/16/curl-page-scraping-script/
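
As a rough sketch of the separation step, PHP's built-in DOMDocument class can list the script elements in the fetched page. This is an alternative to the class in the linked article, not a reproduction of it. The `extract_scripts` name and the sample HTML are made up for the example, and the DOM extension is assumed to be enabled.

```php
<?php
// Collect inline script bodies and the URLs of linked .js files from an HTML string.
function extract_scripts(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = ['inline' => [], 'linked' => []];
    foreach ($doc->getElementsByTagName('script') as $script) {
        $src = $script->getAttribute('src');
        if ($src !== '') {
            $result['linked'][] = $src; // fetch these separately if needed
        } else {
            $result['inline'][] = trim($script->textContent);
        }
    }
    return $result;
}

// Hypothetical page content standing in for the saved file.
$html = '<html><head><script src="/js/stock.js"></script>'
      . '<script>var level = 12;</script></head><body><p>Widget</p></body></html>';

print_r(extract_scripts($html));
```

Note that this only separates the script source from the markup. It does not execute the scripts, so text that the scripts would generate in a browser is still absent.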