When should one use CONNECT and GET HTTP methods a

Posted 2019-01-16 16:19

Question:

I'm building a WebClient library. I'm now implementing a proxy feature, and while doing some research I saw some code that uses the CONNECT method to request a URL.

But when I check the traffic from my web browser, it doesn't use the CONNECT method; it sends a GET instead.

So I'm confused. When should I use each method?

Answer 1:

A CONNECT request asks your proxy to establish a tunnel to the remote endpoint. It is usually used for SSL connections, though it can be used with plain HTTP as well (for proxy chaining and tunneling).

CONNECT www.google.com:443 HTTP/1.1

The above line opens a connection from your proxy to www.google.com on port 443. After this, content that is sent by the client is forwarded by the proxy to www.google.com:443.

If a user tries to retrieve the page http://www.google.com, the proxy can read the request, send the same request itself, and return the response on the user's behalf.

With SSL (HTTPS), only the two endpoints can read the requests; the proxy cannot decipher them. Hence, all it does is open the tunnel requested by CONNECT and let the two endpoints (web server and client) talk to each other directly.
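The difference between the two cases can be sketched in a few lines of Python. This is only an illustration under assumptions: `first_request_to_proxy` is a hypothetical helper name, it builds the bytes without sending them, and it ignores details a real client needs (proxy authentication, extra headers, IPv6 literals).

```python
from urllib.parse import urlsplit


def first_request_to_proxy(url: str) -> bytes:
    """Build the first request a client might send to an HTTP proxy for `url`.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme == "https":
        port = parts.port or 443
        # HTTPS: ask the proxy for a raw tunnel. The TLS handshake and the
        # real GET are sent afterwards, inside that tunnel.
        return (
            f"CONNECT {host}:{port} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n\r\n"
        ).encode("ascii")
    if parts.scheme == "http":
        # Plain HTTP: the proxy can read the request, so the client sends a
        # normal GET whose target is the absolute URL.
        host_header = host if parts.port is None else f"{host}:{parts.port}"
        return (
            f"GET {url} HTTP/1.1\r\n"
            f"Host: {host_header}\r\n\r\n"
        ).encode("ascii")
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")
```

For `https://www.google.com/` the first line is a CONNECT to `www.google.com:443`; for `http://www.google.com/` it is a GET of the full URL.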

Proxy Chaining:

If you are chaining 2 proxy servers, this is the sequence of requests to be issued.

GET1 is the original GET request (HTTP URL)
CONNECT1 is the original CONNECT request (SSL/HTTPS URL or Another Proxy)

User Request ==CONNECT1==> (Your_Primary_Proxy ==CONNECT==> AnotherProxy-1 ... ==CONNECT==> AnotherProxy-n) ==GET1 (if HTTP) / CONNECT1 (if HTTPS)==> Destination_URL
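One possible reading of this sequence, assuming the client itself drives the chain over its single connection to the primary proxy, is sketched below. `chain_requests` and the proxy names in the usage note are hypothetical; the function only lists request lines and sends nothing.

```python
from urllib.parse import urlsplit


def chain_requests(further_proxies, url):
    """List the request lines sent, in order, over the connection to the
    primary proxy.

    `further_proxies` holds (host, port) pairs for the proxies AFTER the
    primary one. Each CONNECT is handled by the last proxy reached so far
    and extends the tunnel by one hop.
    """
    lines = [f"CONNECT {host}:{port} HTTP/1.1" for host, port in further_proxies]
    parts = urlsplit(url)
    if parts.scheme == "https":
        # HTTPS destination: one more CONNECT, then TLS inside the tunnel.
        lines.append(f"CONNECT {parts.hostname}:{parts.port or 443} HTTP/1.1")
    else:
        # HTTP destination: the last proxy receives an ordinary GET.
        lines.append(f"GET {url} HTTP/1.1")
    return lines
```

With one extra proxy, `chain_requests([("proxy1.example", 3128)], "https://www.google.com/")` yields a CONNECT to `proxy1.example:3128` followed by a CONNECT to `www.google.com:443`.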


Answer 2:

TL;DR: a web client uses CONNECT only when it knows it is talking to a proxy and the final URI begins with https://.

Yes, I'm answering 4 years later. When a browser says:

CONNECT www.google.com:443 HTTP/1.1

it means:

"Hi proxy, please open a raw TCP connection to Google; any bytes I write after this, you just repeat over that connection without any interpretation. Oh, and one more thing: do that only if you talk to Google directly. If you use another proxy yourself, just send it the same CONNECT instead."

Note how this says nothing about TLS (https). In fact, CONNECT is orthogonal to TLS; you can have one, the other, or both.
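A minimal client-side sketch of that exchange, with no TLS involved at all, might look like this. `open_tunnel` is a hypothetical helper; it assumes `proxy_sock` is already connected to the proxy and skips proxy authentication and timeouts.

```python
import socket


def open_tunnel(proxy_sock: socket.socket, host: str, port: int) -> str:
    """Send CONNECT on an already connected proxy socket.

    Returns the proxy's status line on a 2xx reply, raises otherwise.
    """
    request = f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"
    proxy_sock.sendall(request.encode("ascii"))

    # Read the reply one byte at a time up to the blank line, so that no
    # bytes belonging to the tunnel itself are consumed by accident.
    head = b""
    while not head.endswith(b"\r\n\r\n"):
        byte = proxy_sock.recv(1)
        if not byte:
            raise ConnectionError("proxy closed the connection before replying")
        head += byte

    status_line = head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    fields = status_line.split(" ", 2)
    if len(fields) < 2 or not fields[1].startswith("2"):
        raise ConnectionError(f"proxy refused CONNECT: {status_line}")
    # From here on, the socket behaves like a raw connection to host:port.
    return status_line
```

For an https:// URL, the client would then typically wrap the same socket with `ssl.SSLContext.wrap_socket(sock, server_hostname=host)` and send its GET inside the TLS session.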

That being said, the intent of CONNECT is to allow an end-to-end encrypted TLS session, so the data is unreadable to a proxy (or a whole proxy chain). It works even if a proxy doesn't understand TLS at all, because CONNECT can be issued inside plain HTTP and requires nothing more from the proxy than copying raw bytes around.

The connection to the first proxy can itself use TLS (https), although that means traffic between you and the first proxy is encrypted twice.

Obviously, it makes no sense to send CONNECT when talking directly to the final server. You just start TLS and then issue an HTTP GET. End servers normally disable CONNECT altogether.

For a proxy, CONNECT support adds security risks. Any data can be passed through CONNECT, even an SSH hacking attempt against a server on 192.168.1.*, or SMTP traffic sending spam. The outside world sees these attacks as regular TCP connections initiated by the proxy. The targets don't care about the reason, and they cannot check whether HTTP CONNECT is to blame. Hence it's up to proxies to secure themselves against misuse.
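The "copying raw bytes around" part of a proxy can be sketched as follows. This is a simplified illustration with hypothetical names (`pump`, `relay`): it assumes both sockets are already connected and uses one extra thread per tunnel, which a production proxy would likely replace with non-blocking I/O.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF, never inspecting them."""
    while True:
        chunk = src.recv(4096)
        if not chunk:
            break
        dst.sendall(chunk)
    try:
        # Pass the end-of-stream on to the other side.
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the peer may already have closed the connection


def relay(client: socket.socket, upstream: socket.socket) -> None:
    """Relay in both directions until both sides have finished sending."""
    back = threading.Thread(target=pump, args=(upstream, client), daemon=True)
    back.start()
    pump(client, upstream)
    back.join()
```

Because `pump` never looks at the payload, it makes no difference whether the bytes are a TLS handshake, plain HTTP, or anything else.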
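One common way for a proxy to limit this kind of misuse is to check the CONNECT target before opening the tunnel. The sketch below is an assumption-laden example, not a complete policy: `connect_allowed` and `ALLOWED_PORTS` are hypothetical names, the port list is an arbitrary choice, and the function expects the address the proxy has already resolved rather than a hostname.

```python
import ipaddress

# Assumption for this sketch: only HTTPS tunnels are permitted.
ALLOWED_PORTS = {443}


def connect_allowed(resolved_ip: str, port: int) -> bool:
    """Decide whether a CONNECT to an already resolved address is acceptable."""
    addr = ipaddress.ip_address(resolved_ip)
    if not addr.is_global:
        # Refuse private, loopback, link-local and similar ranges,
        # e.g. anything in 192.168.1.*.
        return False
    return port in ALLOWED_PORTS
```

Under this policy, `connect_allowed("192.168.1.10", 22)` is rejected both for the address and for the port, while a public address on port 443 is accepted.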



Answer 3:

As a rule of thumb, GET is used for plain HTTP and CONNECT for HTTPS.

There are more details, though, so you probably want to read the relevant RFCs:

http://www.ietf.org/rfc/rfc2068.txt
http://www.ietf.org/rfc/rfc2817.txt