The Django documentation on its CSRF protection states that:
In addition, for HTTPS requests,
strict referer checking is done by
CsrfViewMiddleware. This is necessary
to address a Man-In-The-Middle attack
that is possible under HTTPS when
using a session independent nonce, due
to the fact that HTTP 'Set-Cookie'
headers are (unfortunately) accepted
by clients that are talking to a site
under HTTPS. (Referer checking is not
done for HTTP requests because the
presence of the Referer header is not
reliable enough under HTTP.)
I have trouble visualizing how this attack works. Could somebody explain?
UPDATE:
The wording in the Django doc seems to imply that there is a specific type of man-in-the-middle attack (which leads to a successful CSRF I'd assume) that works with session independent nonce (but not with transaction specific nonce etc., I suppose) and involves the use of 'Set-Cookie' header.
So I wanted to know how that specific type of attack works.
The attacker can set the CSRF cookie using Set-Cookie, and then supply a matching token in the POST form data. Since the site does not tie the session cookies to the CSRF cookies, it has no way of determining that the CSRF token + cookie are genuine (doing hashing etc. of one of them will not work, as the attacker can just get a valid pair from the site directly, and use that pair in the attack).
Directly from the django project
(I googled for session independent nonce.)
Here's a very detailed description of one such MitM attack. Below is an abridged and simplified adaptation:
Assume that:
- the attacked site is foo.com
- we (the attacker) can MitM all requests
- some pages are served over HTTP (e.g., http://foo.com/browse)
- some pages are served over HTTPS (e.g., https://foo.com/check_out), and those pages are protected by a log-in cookie (w/Secure set). Note that this means we cannot steal the user's login cookie.
- all forms are protected by comparing a form parameter with the csrftoken cookie. As noted in the django docs, it's irrelevant to this attack whether they are "signed" or just random nonces.
Grab a valid CSRF token
- just read the traffic when the user visits http://foo.com/browse
- or, if the tokens are form-specific, we can just log into the site with our own account and get a valid token from http://foo.com/check_out on our own.
MitM to force attacker-controlled POST to HTTPS page with that token:
Modify an HTTP-served page (e.g., http://foo.com/browse) to have an auto-submitting form that submits to an HTTPS POST end-point (e.g., https://foo.com/check_out). Also set their CSRF cookie (via a Set-Cookie header on the modified HTTP response) to match your token:
<script type="text/javascript">
function loadFrame(){
var form=document.getElementById('attackform');
// Make sure that the form opens in a hidden frame so user doesn't notice
form.setAttribute('target', 'hiddenframe');
form.submit();
}
</script>
<form name="attackform" id="attackform" style="display:none" method="POST"
action="https://foo.com/check_out">
<input type="text" name="expensive-thing" value="buy-it-now"/>
<input type="text" name="csrf" value="csrf-token-value"/>
</form>
<iframe name="hiddenframe" style="display:none" id="hiddenframe"></iframe>
<XXX onload="loadFrame();">
A Man-In-The-Middle attack, explained in very simplistic terms: imagine two people who do a handshake before they start a two-way conversation. A third person who analyzes how the two communicate (What are their mannerisms? Do they do a special handshake before they speak? What time do they like to talk?) can mold his or her own communication to the point of being embedded in the conversation, acting as a mediator while the original two still think they are speaking directly with each other.
Now bring the concept down to the geek level. When a PC, router, program, etc. communicates with another node on the network, two-way communication occurs by authentication, acknowledgement, or both. If a third party can determine the required sequence of events (session ID, session cookie, the next acknowledge/transfer/termination step in the traffic, etc.), it can disguise its own traffic as coming from a legitimate node and flood one of the legitimate nodes with it; if it gets the sequence right, the malicious third party becomes accepted as a legitimate node.
Let's say we have a Django-powered site and a malicious Man-In-the-Middle. In the general case the site wouldn't even have to serve http:// pages at all for the attack to succeed. In Django's case, it probably needs to serve at least one CSRF-protected page over plain http:// (see below for the explanation).
The attacker first needs to get a syntactically valid CSRF token. For some types of token (like a simple random string) she might be able to just make one up. For Django's scrambled tokens she will probably have to get one from an http:// page that includes CSRF (e.g. in a hidden form field).
The key point is that Django's CSRF tokens aren't tied to the user's session or any other saved state. Django will simply look to see if there is a match between the cookie and the form value (or header in the case of AJAX). So any valid token will do.
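To make that point concrete, here is a minimal sketch of a session-independent "double submit" check. It is not Django's actual code (Django also unscrambles its tokens before comparing), and the function name `double_submit_check` is hypothetical; the sketch only illustrates that the check compares two values that both arrive from the client.

```python
import hmac


def double_submit_check(cookie_token: str, form_token: str) -> bool:
    """Hypothetical, simplified check: accept when cookie and form value match.

    Nothing here consults the user's session, so the server cannot tell
    who originally chose the token.
    """
    if not cookie_token or not form_token:
        return False
    # Constant-time comparison of the two client-supplied values.
    return hmac.compare_digest(cookie_token, form_token)


# A pair issued by the site passes...
legit_ok = double_submit_check("site-issued-token", "site-issued-token")

# ...and so does a pair where the attacker planted both halves.
attack_ok = double_submit_check("attacker-token", "attacker-token")

# Only a mismatch (the classic cross-site case) is rejected.
mismatch_ok = double_submit_check("site-issued-token", "guessed-token")
```

Because the attacker in this scenario controls both the cookie and the form field, the comparison alone cannot stop her.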
The user requests a page over http://. The attacker is free to modify the response since it's unencrypted. She does a Set-Cookie with her malicious CSRF token, and alters the page to include a hidden form (and the JavaScript to submit it) which POSTs to an https:// endpoint. That form, of course, includes the field with the CSRF value.
When the user's browser loads the response, it stores the CSRF cookie specified by the Set-Cookie header and then runs the JavaScript to submit the form. It sends the POST to the https:// endpoint along with the malicious CSRF cookie.
(The "unfortunate" fact that cookies set over http:// will be sent to https:// endpoints is discussed in the relevant RFC: "An active network attacker can also inject cookies into the Cookie header sent to https://example.com/ by impersonating a response from http://example.com/ and injecting a Set-Cookie header. The HTTPS server at example.com will be unable to distinguish these cookies from cookies that it set itself in an HTTPS response. An active network attacker might be able to leverage this ability to mount an attack against example.com even if example.com uses HTTPS exclusively.")
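The behaviour described in that quote can be modelled with a deliberately naive cookie jar. This is only a toy sketch: the class `NaiveCookieJar` is hypothetical, and it ignores cookie attributes such as Secure, Path, and Domain as well as any browser-specific protections. It shows just the one property that matters here, namely that cookies are looked up by host and name without regard to the scheme that set them.

```python
from urllib.parse import urlsplit


class NaiveCookieJar:
    """Toy model: cookies are keyed by (host, name); the scheme is ignored."""

    def __init__(self):
        self._cookies = {}  # (host, name) -> value

    def set_cookie(self, response_url, name, value):
        # Record a cookie as if a Set-Cookie header arrived from response_url.
        host = urlsplit(response_url).hostname
        self._cookies[(host, name)] = value

    def cookies_for(self, request_url):
        # Return every stored cookie whose host matches the request host.
        host = urlsplit(request_url).hostname
        return {
            name: value
            for (cookie_host, name), value in self._cookies.items()
            if cookie_host == host
        }


jar = NaiveCookieJar()
# The real site sets a token over HTTPS.
jar.set_cookie("https://example.com/login", "csrftoken", "server-issued")
# The MitM later injects Set-Cookie into a plain-HTTP response.
jar.set_cookie("http://example.com/browse", "csrftoken", "attacker-chosen")

# The cookie sent with the next HTTPS request is the attacker's.
sent = jar.cookies_for("https://example.com/check_out")
```

In this model the HTTPS request carries `attacker-chosen`, and the server has nothing in the Cookie header that reveals which response set it.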
Finally, the Django server receives the malicious POST request. It compares the CSRF cookie (set by the attacker) to the value in the form (set by the attacker) and sees that they are the same. It allows the malicious request.
So, to avoid that result, Django also checks the Referer header (which is expected to always be set in https:// requests) against the Host header. That check will fail in the example above because the attacker can't forge the Referer header. The browser will set it to the http:// page that the attacker used to host her malicious form, and Django will detect the mismatch between that and the https:// endpoint that it's serving.
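A rough sketch of that extra check, under the assumption that it boils down to "on HTTPS, require an https:// Referer whose host matches the Host header". The function `referer_check` and its parameters are hypothetical and much simpler than what CsrfViewMiddleware really does (for instance, it ignores any configured trusted origins).

```python
from urllib.parse import urlsplit


def referer_check(is_secure, host_header, referer):
    """Hypothetical, simplified strict Referer check for HTTPS requests."""
    if not is_secure:
        # Per the quoted docs, the check is not done for plain HTTP requests.
        return True
    if not referer:
        # On HTTPS a missing Referer is treated as a failure.
        return False
    parts = urlsplit(referer)
    if parts.scheme != "https":
        # The form was hosted on an insecure page: reject.
        return False
    # The Referer must point at the same host that is serving the request.
    return parts.netloc == host_header


# The attack from this answer: form injected into an http:// page.
attack_allowed = referer_check(True, "foo.com", "http://foo.com/browse")

# A genuine form submitted from one of the site's own HTTPS pages.
genuine_allowed = referer_check(True, "foo.com", "https://foo.com/cart")
```

Here the attacker's POST is rejected because its Referer uses http://, while the genuine same-site HTTPS submission passes.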