To begin with, I understand there are other modules such as Requests that would be better suited and simpler to use, but I want to use the socket module to better understand HTTP.
I have a simple script that does the following:
Client ---> HTTP Proxy ---> External Resource (GET Google.com)
I am able to connect to the HTTP proxy alright, but when I send the GET request headers for google.com to the proxy, it doesn't serve me any response at all.
#!/usr/bin/python
import socket
import sys
headers = """GET / HTTP/1.1\r\n
Host: google.com\r\n\r\n"""
socket = socket
host = "165.139.179.225" #proxy server IP
port = 8080 #proxy server port
try:
    s = socket.socket()
    s.connect((host,port))
    s.send(("CONNECT {0}:{1} HTTP/1.1\r\n" + "Host: {2}: {3}\r\n\r\n").format(socket.gethostbyname(socket.gethostname()),1000,port,host))
    print s.recv(1096)
    s.send(headers)
    response = s.recv(1096)
    print response
    s.close()
except socket.error,m:
    print str(m)
    s.close()
    sys.exit(1)
To make an HTTP request through a proxy, open a connection to the proxy server and then send an HTTP proxy request. This request is mostly the same as a normal HTTP request, but the request line contains the absolute URL instead of the relative URL, e.g.
> GET http://www.google.com HTTP/1.1
> Host: www.google.com
> ...
< HTTP response
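As a rough sketch of the request shown above, the helper below builds the raw bytes of a plain-HTTP proxy request. The name `build_proxy_get` is hypothetical (it is not part of any library), and the `Connection: close` header is an assumption added so the server closes the connection after replying.

```python
from urllib.parse import urlsplit


def build_proxy_get(url):
    """Return the raw bytes of a GET request in the form an HTTP proxy expects."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError("expected an absolute http:// URL")
    # An empty path is sent as "/" so the request line stays well-formed.
    path = parts.path or "/"
    absolute = "http://" + parts.netloc + path
    if parts.query:
        absolute += "?" + parts.query
    lines = [
        "GET {0} HTTP/1.1".format(absolute),  # absolute URL, not just the path
        "Host: {0}".format(parts.netloc),
        "Connection: close",                  # assumption: one request per connection
        "",
        "",                                   # the two empty items yield the final CRLF CRLF
    ]
    return "\r\n".join(lines).encode("ascii")
```

The resulting bytes can be passed straight to `sendall()` on a socket that is connected to the proxy.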
To make an HTTPS request, open a tunnel using the CONNECT method and then proceed inside this tunnel normally: do the SSL handshake and then send a normal non-proxy request inside the tunnel, e.g.
> CONNECT www.google.com:443 HTTP/1.1
>
< .. read response to CONNECT request, must be 200 ...
.. establish the TLS connection inside the tunnel
> GET / HTTP/1.1
> Host: www.google.com
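The tunnel steps above could be sketched in Python 3 roughly as follows. The function names (`build_connect`, `connect_ok`, `open_tunnel`) are hypothetical, and the sketch assumes the proxy needs no authentication. It treats any 2xx status as success and includes a `Host` header in the CONNECT request.

```python
import socket
import ssl


def build_connect(host, port):
    """Return the raw CONNECT request for host:port."""
    target = "{0}:{1}".format(host, port)
    return "CONNECT {0} HTTP/1.1\r\nHost: {0}\r\n\r\n".format(target).encode("ascii")


def connect_ok(reply):
    """True if the proxy's reply to CONNECT carries a 2xx status code."""
    status_line = reply.split(b"\r\n", 1)[0].split()
    return len(status_line) >= 2 and status_line[1][:1] == b"2"


def open_tunnel(proxy_host, proxy_port, host, port=443):
    """Open a TLS connection to host:port through an HTTP proxy."""
    s = socket.create_connection((proxy_host, proxy_port))
    s.sendall(build_connect(host, port))
    reply = b""
    # Read until the blank line that ends the proxy's response headers.
    while b"\r\n\r\n" not in reply:
        chunk = s.recv(4096)
        if not chunk:
            break
        reply += chunk
    if not connect_ok(reply):
        s.close()
        raise OSError("proxy refused CONNECT: %r" % reply[:80])
    # From here on the proxy only relays bytes, so TLS runs end to end.
    context = ssl.create_default_context()
    return context.wrap_socket(s, server_hostname=host)
```

The socket returned by `open_tunnel` is then used like a direct connection: send `GET / HTTP/1.1` with a `Host` header and read the response.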
Python 3 sockets send bytes rather than str, so the request must be encoded. Expanding on David's original code and combining it with Steffen's answer, here is the solution written for Python 3:
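import socket
import sys

def connectThroughProxy():
    # Proxy-style request: absolute URL in the request line, CRLF after every line.
    headers = ("GET http://www.example.org HTTP/1.1\r\n"
               "Host: www.example.org\r\n\r\n")
    host = "192.97.215.348"  # proxy server IP (placeholder; replace with your proxy's address)
    port = 8080              # proxy server port
    s = socket.socket()
    try:
        s.connect((host, port))
        s.sendall(headers.encode('utf-8'))  # encode: sockets send bytes in Python 3
        response = s.recv(3000)             # reads at most 3000 bytes of the reply
        print(response)
    except socket.error as m:
        print(str(m))
        sys.exit(1)
    finally:
        s.close()  # close the socket on both the success and the error path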
This lets me connect to the example.org host through my corporate proxy (at least for non-SSL/TLS connections).
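Note that a single `recv(3000)` call returns at most 3000 bytes, and possibly fewer, so a longer response gets cut off. A minimal sketch of reading the whole reply follows. The helper name `recv_all` is hypothetical, and it assumes the server closes the connection after responding (for example because the request carried a `Connection: close` header). Otherwise the loop blocks while waiting for more data.

```python
import socket


def recv_all(sock, bufsize=4096):
    """Read from sock until the peer closes the connection; return all bytes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the peer closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

In the function above, `response = recv_all(s)` would take the place of `response = s.recv(3000)`.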