HTTP file download with JavaScript

Posted 2019-04-07 23:56

Question:

Is there any way (in JavaScript) to download a remote web page (i.e. as you would with curl), read it into a string variable, and process it further?

Answer 1:

You can only download content from the same domain, as per the Same Origin Policy. For same-domain content you can use the XMLHttpRequest object:

 var xhReq = new XMLHttpRequest();
 xhReq.open("GET", "page.html", true);  // true = asynchronous request
 xhReq.onreadystatechange = onResponse;
 xhReq.send(null);
 ...
 function onResponse() {
   // readyState 4 means the request has completed
   if (xhReq.readyState != 4)  { return; }
   var serverResponse = xhReq.responseText;
   ...
 }

There are ways to circumvent the policy, some of them listed on the Wikipedia page for the Same Origin Policy. But circumventing it is a hack at best and illegal at worst.
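To make "same domain" concrete: the policy compares origins, meaning scheme, host, and port together. The following is a minimal sketch of that comparison using the standard URL class; `isSameOrigin` is a hypothetical helper name and the example.com addresses are placeholders, and this is not how the browser itself enforces the policy.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
// URL.origin normalizes default ports, so http://example.com:80 and
// http://example.com compare as equal.
function isSameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Same scheme, host, and port: a same-origin request
console.log(isSameOrigin("http://example.com/app/", "http://example.com/page.html")); // true

// Different scheme, port, or subdomain: cross-origin
console.log(isSameOrigin("http://example.com/", "https://example.com/"));     // false
console.log(isSameOrigin("http://example.com/", "http://example.com:8080/")); // false
console.log(isSameOrigin("http://example.com/", "http://www.example.com/"));  // false
```

Note that a different subdomain already counts as a different origin, which is why a request from example.com to www.example.com is also blocked by default.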



Answer 2:

Sure. The URL must be on the same domain, unless the URL has a cross-domain policy or you create a server-side proxy script.

The following code is an example of an Ajax call to any domain through a PHP proxy script:


var xmlhttp = new XMLHttpRequest();
// proxy.php reads the target from the query string, so a GET is enough;
// encode the target URL so it survives as a single query parameter
xmlhttp.open("GET", "http://localhost/proxy.php?url=" + encodeURIComponent("http://google.com"), true);
xmlhttp.onreadystatechange = function() {
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
      // ensure we have a response...
      if (xmlhttp.responseText) {
         var html = xmlhttp.responseText;
         // do your processing here...
      }
    }
};
xmlhttp.send();

You would then make your proxy.php script connect to the given URL via cURL (or whatever URL library your server-side language has) and simply echo the content back from your own domain:


<?php

// proxy.php

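$ch = curl_init();

// fetch the URL passed in the "url" query parameter
curl_setopt($ch, CURLOPT_URL, $_GET["url"]);
// return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
echo $result;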

?>
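
On the client side, the target URL should be percent-encoded before it is appended to the proxy address, otherwise any "?" or "&" inside it would be read as parameters of the proxy request itself. Here is a rough sketch of that step, assuming the proxy.php script above and its "url" parameter; `buildProxyUrl` and `downloadViaProxy` are hypothetical helper names, and the second one assumes a global `fetch` is available (modern browsers, Node 18+).

```javascript
// Hypothetical helper: builds the request URL for the proxy script.
function buildProxyUrl(proxyBase, targetUrl) {
  const proxy = new URL(proxyBase);
  // searchParams.set percent-encodes the target, so its own "?" and "&"
  // are not mistaken for parameters of the proxy URL
  proxy.searchParams.set("url", targetUrl);
  return proxy.toString();
}

// Hypothetical wrapper: downloads the target page as a string via the proxy.
// Assumes a global fetch (modern browsers, Node 18+).
async function downloadViaProxy(proxyBase, targetUrl) {
  const response = await fetch(buildProxyUrl(proxyBase, targetUrl));
  if (!response.ok) {
    throw new Error("Proxy request failed with status " + response.status);
  }
  return response.text();
}

console.log(buildProxyUrl("http://localhost/proxy.php", "http://google.com"));
// http://localhost/proxy.php?url=http%3A%2F%2Fgoogle.com
```

A call such as `downloadViaProxy("http://localhost/proxy.php", "http://google.com").then(html => ...)` would then hand the page source to your processing code as a string.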


Hope that all makes sense.



Answer 3:

No. JavaScript is restricted to the domain it is running on.



Answer 4:

You can use the Yahoo Query Language to query any page on the web.

For example, if you want the full source of the Google homepage, you could use:

select * from html where url="http://google.com" and xpath='/html' limit 1

You'd have to use their JSON callback and reserialize the returned object, but you'd be able to get a full view of the page.



Answer 5:

Mostly you won't be allowed. The browser will prevent you from doing this for security reasons. However, you can request JSON data from other domains using jQuery. Here is an example from the jQuery docs that gets some cat pictures from Flickr:

$.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?",
    function(data){
      $.each(data.items, function(i,item){
        $("<img/>").attr("src", item.media.m).appendTo("#images");
        if ( i == 4 ) return false;
      });
    });

You can find this code in the jQuery docs. As you can see, it makes a request, gets the data back, and appends image tags with the cat pictures to the DOM.
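
The cross-domain request works because of the `jsoncallback=?` at the end of the URL: jQuery replaces the `?` with the name of a generated callback function and loads the URL through a script tag (JSONP), which is not subject to the same restriction as XMLHttpRequest. Below is a simplified sketch of that mechanism, not jQuery's actual implementation; `jsonpUrl` and `loadJsonp` are hypothetical names, and `loadJsonp` only works in a browser.

```javascript
// Hypothetical helper: fills the "=?" placeholder with a callback name,
// similar in spirit to what jQuery does for JSONP URLs.
function jsonpUrl(urlTemplate, callbackName) {
  return urlTemplate.replace(/=\?(?=&|$)/, "=" + callbackName);
}

// Hypothetical browser-only loader: the server is expected to respond with
// a script of the form callbackName({...}), which runs as soon as it loads.
function loadJsonp(urlTemplate, onData) {
  const callbackName = "jsonp_cb_" + Date.now();
  const script = document.createElement("script");
  window[callbackName] = function (data) {
    // clean up the temporary global and the script tag, then hand over the data
    delete window[callbackName];
    script.remove();
    onData(data);
  };
  script.src = jsonpUrl(urlTemplate, callbackName);
  document.head.appendChild(script);
}

console.log(jsonpUrl("http://api.example.com/feed?format=json&jsoncallback=?", "cb1"));
// http://api.example.com/feed?format=json&jsoncallback=cb1
```

This only works with servers that explicitly support a callback parameter, so it cannot be used to fetch the HTML of an arbitrary page.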