Please try this yourself :) !
curl http://www.windowsphone.com/en-US/apps?list=free
the result is:
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=11&checkda=1&ct=1320735308&rver=6.1.6195.0&wp=MBI&wreply=http:%2F%2Fwww.windowsphone.com%2Fen-US%2Fapps%3Flist%3Dfree&lc=1033&id=268289">here</a>.</h2>
</body></html>
or
import random
import socket
import urllib2

def download(source_url):
    try:
        socket.setdefaulttimeout(10)
        # Pick a random User-Agent so the request looks like it comes from a browser.
        agents = ['Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)',
                  'Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)',
                  'Microsoft Internet Explorer/4.0b1 (Windows 95)',
                  'Opera/8.00 (Windows NT 5.1; U; en)']
        ree = urllib2.Request(source_url)
        ree.add_header('User-Agent', random.choice(agents))
        resp = urllib2.urlopen(ree)
        htmlSource = resp.read()
        return htmlSource
    except Exception, e:
        print e
        return ""

download('http://www.windowsphone.com/en-US/apps?list=free')
the result is:
<html><head><meta http-equiv="REFRESH" content="0; URL=http://www.windowsphone.com/en-US/apps?list=free"><script type="text/javascript">function OnBack(){}</script></head></html>
I want to download the actual source of the webpage.
Flesk really has the answer on this one (+1).
Another straightforward way to debug HTTP connections is Netcat, which is basically a powerful telnet utility.
So let's say you want to debug what's going on in your HTTP request:
That will send the request header to the server (you'll need to press the enter key twice to send).
After that, the server will respond:
So the server returns 302, the HTTP status code for a redirect, which prompts the "browser" to open the URL passed in the Location header.
Netcat is a great tool to debug and trace all kinds of network communication and helped me a lot when I wanted to dig a little deeper into the HTTP protocol.
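The same kind of exchange can be reproduced from Python with a raw socket, which makes the 302 and its Location header easy to inspect. This is only a minimal sketch, assuming Python 3; the helper names `build_request`, `parse_response_head` and `fetch_head` are made up for illustration and are not part of any library.

```python
import socket


def build_request(host, path):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",
        "",  # the empty line ends the header block (the "press enter twice" step)
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw):
    """Return (status_code, headers) parsed from the header block of a raw response."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def fetch_head(host, path, port=80, timeout=10):
    """Send the request over a plain TCP socket and return (status_code, headers)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return parse_response_head(b"".join(chunks))
```

Calling `fetch_head('www.windowsphone.com', '/en-US/apps?list=free')` should return the status code together with a dictionary of headers, so for a redirect you can look at `headers.get('location')` to see where the server wants the client to go next.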
It fails because http://www.windowsphone.com attempts to set a cookie, which is then checked on https://login.live.com; that site creates another cookie and redirects back to windowsphone.com if the check succeeds.
You should look into http://docs.python.org/library/cookielib.html
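As a rough sketch of what that could look like: in Python 3, `cookielib` is named `http.cookiejar` and `urllib2` is `urllib.request`, so the example below uses those names. The helper `make_cookie_opener` and the default User-Agent string are assumptions made for illustration, and I have not verified the result against the live site.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(user_agent="Mozilla/5.0"):
    """Build an opener that stores cookies and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    # build_opener also installs the default redirect handler, so cookies set
    # on one hop of a redirect chain are available on the next hop.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def download(source_url, timeout=10):
    """Fetch source_url, following redirects while keeping cookies between hops."""
    opener, _jar = make_cookie_opener()
    with opener.open(source_url, timeout=timeout) as resp:
        return resp.read()
```

The point of the design is that one cookie jar is shared by every request the opener makes, which is what a plain `urlopen` call lacks.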
If you want to use curl, allow it to create a cookie-file like so:
Run
more myCookieJar
in your shell and you'll see something like this:
Run (notice the -b option before 'myCookieJar'):
and you'll get the contents of the page in the file windowsphone.html as you see it in your browser.
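If you would rather keep everything in Python, the two-step cookie-file approach can be imitated with `http.cookiejar.MozillaCookieJar`, which reads and writes the same Netscape-style cookie file format that curl uses. A minimal sketch, assuming Python 3; the function names `load_cookie_file` and `fetch_with_cookie_file` are hypothetical, and the file name is only reused from the curl example.

```python
import http.cookiejar
import os
import urllib.request


def load_cookie_file(cookie_path):
    """Return a MozillaCookieJar bound to cookie_path, loading it if it exists."""
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    if os.path.exists(cookie_path):
        # ignore_discard=True also loads session cookies
        jar.load(ignore_discard=True)
    return jar


def fetch_with_cookie_file(url, cookie_path="myCookieJar", timeout=10):
    """Fetch url using cookies from cookie_path, then write the updated cookies back."""
    jar = load_cookie_file(cookie_path)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    with opener.open(url, timeout=timeout) as resp:
        body = resp.read()
    jar.save(ignore_discard=True)  # persist cookies for the next run
    return body
```

Running `fetch_with_cookie_file` twice against the same URL roughly corresponds to the two curl invocations: the first call fills the cookie file, the second sends those cookies back.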