Why are HTTP proxies able to support protocols lik

Posted 2020-05-13 02:45

Question:

I understand that a SOCKS proxy only establishes a connection at the TCP level, while an HTTP proxy interprets traffic at the HTTP level. Thus a SOCKS proxy can work for any kind of protocol, while an HTTP proxy can only handle HTTP traffic. But then why can an HTTP proxy like Squid support protocols like IRC and FTP? When we use an HTTP proxy for an IRC or FTP connection, what specifically happens? Is any metadata added to the packets when they are sent to the proxy over HTTP?

Answer 1:

An HTTP proxy can support higher-level protocols other than HTTP if it supports the CONNECT method, which is primarily used for HTTPS connections. Here is the description from the Squid wiki:

The CONNECT method is a way to tunnel any kind of connection through an HTTP proxy. By default, the proxy establishes a TCP connection to the specified server, responds with an HTTP 200 (Connection Established) response, and then shovels packets back and forth between the client and the server, without understanding or interpreting the tunnelled traffic

If the client software can connect through a CONNECT-enabled HTTP (HTTPS) proxy, it can run any TCP-based higher-level protocol through the resulting tunnel (VPN, SSH, SQL, version control, etc.).
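
To make the handshake concrete, here is a minimal client-side sketch in Python. It assumes you already hold a TCP socket connected to the proxy. The function names (`build_connect_request`, `parse_status_code`, `open_tunnel`) and the host `irc.example.org` are made up for illustration and are not part of Squid or any library. The point it tries to show is that the only HTTP involved is the initial CONNECT request and the proxy's status reply; after that, no extra metadata is wrapped around the tunnelled bytes.

```python
import socket


def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT request a client sends to the proxy (hypothetical helper)."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def parse_status_code(response_head: bytes) -> int:
    """Extract the numeric status from a reply such as 'HTTP/1.1 200 Connection established'."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii")
    return int(status_line.split(" ", 2)[1])


def open_tunnel(proxy_sock: socket.socket, host: str, port: int) -> socket.socket:
    """Ask the proxy to open a tunnel to host:port over an already-connected socket."""
    proxy_sock.sendall(build_connect_request(host, port))

    # Read one byte at a time so we stop exactly at the end of the HTTP
    # headers and do not swallow tunnelled bytes (e.g. a server banner).
    head = b""
    while not head.endswith(b"\r\n\r\n"):
        chunk = proxy_sock.recv(1)
        if not chunk:
            raise ConnectionError("proxy closed the connection during CONNECT")
        head += chunk

    status = parse_status_code(head)
    if not 200 <= status < 300:
        raise ConnectionError(f"proxy refused CONNECT with status {status}")

    # From here on the socket carries the raw application protocol.
    return proxy_sock
```

After `open_tunnel` returns, the caller would speak IRC, FTP, SSH, or TLS over the same socket exactly as if it were connected to the destination directly.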



Answer 2:

As others have mentioned, the HTTP CONNECT method allows you to establish any TCP-based connection via a proxy. This functionality is needed primarily for HTTPS, because in an HTTPS connection the entire HTTP request is encrypted (so it appears to the proxy as a "meaningless" TCP connection). In other words, an HTTPS session over a proxy and an SSH or FTPS session over a proxy will all appear as "encrypted sessions" to the proxy. It cannot tell them apart, so it has to either allow all of them or none of them.

During normal operation, the HTTP proxy receives the HTTP request and is "smart enough" to understand it, so it can do high-level things with it (e.g. search its cache to see if it can serve the response without going to the destination server, or consult a whitelist/blacklist to see if the URL is allowed). In CONNECT mode, none of this happens. The proxy establishes a TCP connection to the destination server and simply forwards all traffic from the client to the destination server and all traffic from the destination server to the client. That means any TCP-based protocol can work (HTTPS, SSH, FTP - even plain HTTP).
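
The forwarding step described above can be sketched roughly as follows. This is not how Squid is implemented; it is a simplified Python illustration, and the `relay` function name and its parameters are hypothetical. It assumes the proxy has already answered the CONNECT request and holds two connected sockets, one to the client and one to the destination server.

```python
import select
import socket


def relay(client: socket.socket, upstream: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes in both directions until either side closes its connection.

    The proxy never inspects `data`: it neither knows nor cares whether the
    bytes are TLS, SSH, FTP, or IRC.
    """
    peers = {client: upstream, upstream: client}
    while True:
        # Wait until at least one of the two sockets has data (or EOF) to read.
        readable, _, _ = select.select(list(peers), [], [])
        for src in readable:
            data = src.recv(bufsize)
            if not data:
                # Simplification: stop as soon as one side closes, without
                # handling half-closed connections.
                return
            peers[src].sendall(data)
```

A real proxy would also apply access rules to the CONNECT target (for example, restricting which destination ports are allowed), set timeouts, and close both sockets once `relay` returns.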