I am trying to get the XML from a Data Service at my institution using PHP and cURL (libcurl). The development is being done on my local machine. It is code that is eval'd in PHP as part of Drupal and the Transformations module.
libcurl has SSL support, as shown by running:
$ curl-config --features
(from libcurl docs)
SSL
IPv6
libz
NTLM
The PHP code being executed:
/**
 * Get a web file (HTML, XHTML, XML, image, etc.) from a URL. Return an
 * array containing the HTTP server response header fields and content.
 * FROM: http://bit.ly/lNIlOu
 */
function get_web_page( $url )
{
    $agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.112 Safari/534.30';
    //$agent = 'spider';
    $options = array(
        CURLOPT_RETURNTRANSFER => true,   // return web page if successful
        CURLOPT_HEADER         => false,  // don't return headers
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_ENCODING       => "",     // handle all encodings
        CURLOPT_USERAGENT      => $agent, // who am i
        CURLOPT_AUTOREFERER    => true,   // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,    // timeout on connect
        CURLOPT_TIMEOUT        => 120,    // timeout on response
        CURLOPT_MAXREDIRS      => 10,     // stop after 10 redirects
        CURLOPT_SSL_VERIFYPEER => false,  // Disabled SSL Cert checks
        CURLOPT_SSL_VERIFYHOST => false,  // Disable host checks ?
    );
    $ch = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err = curl_errno( $ch );
    $errmsg = curl_error( $ch );
    $header = curl_getinfo( $ch );
    curl_close( $ch );
    $header['errno'] = $err;
    $header['errmsg'] = $errmsg;
    $header['content'] = $content;
    return $header;
}
$url = 'https://ws.admin.washington.edu/student/v4/public/section.xml?year=2011&quarter=autumn&curriculum_abbreviation=BIOL&course_number=&id=&search_by=Instructor';
$result = get_web_page($url);
echo '<pre>CURL result:<br/>';
var_dump($result);
echo '</pre>';
A slimmed-down version of the var_dump of $result:
array(24) {
["url"]=>
string(155) "https://ws.admin.washington.edu/student/v4/public/section.xml?year=2011&quarter=autumn&curriculum_abbreviation=BIOL&course_number=&id=&search_by=Instructor"
["content_type"]=>
NULL
["http_code"]=>
int(0)
["header_size"]=>
int(0)
["request_size"]=>
int(237)
...
["ssl_verify_result"]=>
int(20)
["redirect_count"]=>
int(0)
["total_time"]=>
float(120.41427)
...
["connect_time"]=>
float(0.11626)
...
["certinfo"]=>
array(0) {
}
["errno"]=>
int(28)
["errmsg"]=>
string(67) "Operation timed out after 120000 milliseconds with 0 bytes received"
["content"]=>
bool(false)
}
When I visit the site myself it simply loads. I even set the agent signature to be the exact same as my own.
Any help would be appreciated.
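As a rough aid for reading a dump like the one above, the fields can be summarised in code. This is only a sketch: describe_curl_failure() is a hypothetical helper, not part of the original script, and it assumes the array shape returned by get_web_page(). The value 28 is libcurl's "operation timed out" error code, and a non-zero connect_time suggests the TCP connection itself was established before the transfer stalled.

```php
<?php
/**
 * Summarise a failed transfer from the array returned by get_web_page().
 * Hypothetical helper for illustration; field names match the dump above.
 */
function describe_curl_failure( array $result )
{
    if ( $result['errno'] === 0 ) {
        return 'no cURL error';
    }
    $notes = array( 'cURL error ' . $result['errno'] . ': ' . $result['errmsg'] );
    if ( $result['errno'] === 28 ) { // 28 is libcurl's operation-timed-out code
        // libcurl reports connect_time as 0 when the connect step never finished
        $notes[] = $result['connect_time'] > 0
            ? 'TCP connect succeeded, so the stall came later'
            : 'the connection itself never completed';
    }
    if ( $result['http_code'] === 0 ) {
        $notes[] = 'no HTTP response was received';
    }
    return implode( '; ', $notes );
}
```

Calling describe_curl_failure( $result ) on the dump above would report the timeout, a successful connect, and the missing HTTP response, which may hint that the problem lies in the SSL handshake rather than in reaching the host.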
For me adding
curl_setopt($ch, CURLOPT_SSLVERSION, 3);
magically solved the issue!
It works now when I upload it to my server. I was trying this from my Mac running OS X 10.6.7, which seems to block port 443, the port commonly used for HTTPS. I could not find a way to open it up or find out why it was blocked.
But my script works fine outside of my local machine.
Thanks for your help so far.
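For anyone wanting to try the CURLOPT_SSLVERSION suggestion inside an options array like the one in the question, here is a minimal sketch. build_curl_options() and fetch_with_ssl_version() are hypothetical names introduced for illustration, and the shorter timeouts are an assumption to make debugging quicker. Note that 3 selects SSLv3, which has since been deprecated; many current libcurl builds refuse it, so a TLS constant such as CURL_SSLVERSION_TLSv1_2 may be needed instead.

```php
<?php
/**
 * Build cURL options, optionally pinning the SSL/TLS protocol version.
 * Hypothetical helper, not part of the original get_web_page().
 */
function build_curl_options( $sslVersion = null )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_CONNECTTIMEOUT => 10,   // assumed shorter timeouts for debugging
        CURLOPT_TIMEOUT        => 30,
    );
    if ( $sslVersion !== null ) {
        // 3 means SSLv3, the value used in the answer above
        $options[CURLOPT_SSLVERSION] = $sslVersion;
    }
    return $options;
}

/**
 * Fetch a URL with the options above and return info, error fields, and content.
 */
function fetch_with_ssl_version( $url, $sslVersion = null )
{
    $ch = curl_init( $url );
    curl_setopt_array( $ch, build_curl_options( $sslVersion ) );
    $content = curl_exec( $ch );
    $result = curl_getinfo( $ch );
    $result['errno'] = curl_errno( $ch );
    $result['errmsg'] = curl_error( $ch );
    $result['content'] = $content;
    curl_close( $ch );
    return $result;
}
```

With the $url from the question, the call would be $result = fetch_with_ssl_version( $url, 3 ); leaving the second argument out lets libcurl negotiate the protocol version itself.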