What I'd like to do is retrieve some data from Wikipedia with Ajax. I left the client-side scripting for later and first tried retrieving some arbitrary content from PHP. I tried fopen() and fread(), but that didn't work. Then I came across an article with code for Internet providers that use proxies. Since that's my case, I tried the code below, but it didn't return any response.
<?php
$opts = array('http' => array('proxy' => 'tcp://10.10.10.101:8080', 'request_fulluri' => true));
$context = stream_context_create ($opts);
$data = file_get_contents('http://www.php.net', false, $context);
echo $data;
?>
OK, so I tried the suggested code with the proper proxy values:
<?php
$url = 'http://www.php.net';
$proxy = '10.10.10.101:8080';
//$proxyauth = 'user:password';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
//curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyauth);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);
echo $curl_scraped_page;
But it gives me this error:
HTTP/1.0 403 Forbidden
Date: Mon, 02 Jul 2012 09:41:20 GMT
Server: Apache
Content-Type: text/plain

Destination host forbidden
I don't understand why it doesn't work or how I could solve the problem.
It's not really a cross-domain problem, because you are loading the data from the server, not the browser.
To load a web page from PHP via a proxy, it's best to use cURL (a PHP HTTP client: http://php.net/manual/en/book.curl.php).
An example is available in a similar question (http://stackoverflow.com/questions/5211887/how-to-use-curl-via-a-proxy).
If your proxy needs authentication, you can set the $proxyauth variable.
I just tested your code, simply using my own proxy address, and it works.
So what you're seeing is probably a response from the proxy itself, which does not allow some (or perhaps all) external sites to be reached. Maybe all you need is to authenticate with the proxy.
This probably means that you won't be able to do this via file_get_contents, cURL, fsockopen, or any other way until you've cleared this with the network administrators.
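To narrow down whether the refusal really comes from the proxy, it can help to look at the HTTP status code and the cURL error string separately instead of echoing the raw response. The following is only a rough sketch: the function names (proxy_curl_options, describe_result, fetch_via_proxy) are made up for illustration, and the 10-second timeout is an arbitrary assumption. The 403 case cannot tell for certain whether the proxy or the target site refused the request, so treat its message as a hint.

```php
<?php
// Hypothetical helper: build the cURL options for a request routed through a proxy.
// $proxyAuth is optional and has the form 'user:password'.
function proxy_curl_options(string $url, string $proxy, ?string $proxyAuth = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 10, // assumed value, adjust as needed
    ];
    if ($proxyAuth !== null) {
        $options[CURLOPT_PROXYUSERPWD] = $proxyAuth;
    }
    return $options;
}

// Hypothetical helper: turn the outcome of a request into a short diagnosis.
// $body is the return value of curl_exec() (false on a transport failure).
function describe_result($body, int $status, string $error): string
{
    if ($body === false) {
        return "Transport error: $error";
    }
    if ($status === 407) {
        return 'Proxy requires authentication (407)';
    }
    if ($status === 403) {
        return 'Request refused (403), possibly by the proxy itself';
    }
    return "HTTP status $status";
}

// Hypothetical helper: perform the request and return [body, diagnosis].
function fetch_via_proxy(string $url, string $proxy, ?string $proxyAuth = null): array
{
    $ch = curl_init();
    curl_setopt_array($ch, proxy_curl_options($url, $proxy, $proxyAuth));
    $body   = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $error  = curl_error($ch);
    // The handle is released when $ch goes out of scope.
    return [$body, describe_result($body, $status, $error)];
}
```

With the values from the question, a call could look like `list($body, $diagnosis) = fetch_via_proxy('http://www.php.net', '10.10.10.101:8080');` followed by `echo $diagnosis;`. If the diagnosis stays at 403 even with valid proxy credentials, that supports the idea that the proxy policy blocks the destination.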