I am working on a WebSocket implementation and do not understand what purpose the mask serves in a frame.
Could somebody explain to me what it does and why it is recommended?
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
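To make the layout above concrete, here is a minimal Python sketch that reads the header fields of a single frame from a byte string. The function name `parse_frame_header` and the returned dictionary keys are my own illustrative choices, not part of any library, and the sketch ignores fragmentation and control-frame rules.

```python
import struct


def parse_frame_header(data: bytes) -> dict:
    """Parse the header of one WebSocket frame found at the start of `data`.

    Hypothetical helper for illustration only; it returns the decoded
    fields plus the offset at which the payload starts.
    """
    if len(data) < 2:
        raise ValueError("a frame header needs at least 2 bytes")

    b0, b1 = data[0], data[1]
    header = {
        "fin": bool(b0 & 0x80),      # bit 0: final fragment
        "rsv": (b0 >> 4) & 0x07,     # bits 1-3: RSV1..RSV3
        "opcode": b0 & 0x0F,         # bits 4-7: opcode
        "masked": bool(b1 & 0x80),   # bit 8: MASK flag
    }

    length = b1 & 0x7F               # bits 9-15: 7-bit payload length
    offset = 2
    if length == 126:
        # The next 2 bytes hold the real length (network byte order).
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:
        # The next 8 bytes hold the real length (network byte order).
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8

    if header["masked"]:
        # The 4-byte masking key is present only when MASK is set.
        header["masking_key"] = data[offset:offset + 4]
        offset += 4
    else:
        header["masking_key"] = None

    header["payload_length"] = length
    header["payload_offset"] = offset
    return header
```

For the masked "Hello" text frame used as an example in the RFC (`81 85 37 fa 21 3d 7f 9f 4d 51 58`), this reports a payload length of 5, the masking key `37 fa 21 3d`, and a payload starting at offset 6.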
Websockets are defined in RFC6455, which states in Section 5.3:
In a blog entry about Websockets I found the following explanation:
But the clearest answer I found was in a mailing list archive. There, John Tamplin states:
As kmkaplan stated, the attack vector is described in Section 10.3 of the RFC.
This is a measure to prevent proxy cache poisoning attacks (see Paper about the attack). Masking introduces randomness into the bytes on the wire: each payload byte is XORed with the corresponding byte of the random 4-byte masking key, cycling through the key.
By the way: it isn't just recommended. It is obligatory for every frame a client sends to a server.
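The XOR step can be sketched in a few lines of Python. The function name `apply_mask` is hypothetical; the only assumption is that the masking key is exactly 4 bytes, as in the frame layout. Because XOR is its own inverse, the same function both masks and unmasks.

```python
def apply_mask(payload: bytes, masking_key: bytes) -> bytes:
    """XOR every payload byte with masking_key[i % 4].

    Illustrative helper, not a library function. Applying it twice
    with the same key returns the original payload.
    """
    if len(masking_key) != 4:
        raise ValueError("the masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


# Masked payload and key taken from the RFC's masked "Hello" example frame.
key = bytes([0x37, 0xFA, 0x21, 0x3D])
masked = bytes([0x7F, 0x9F, 0x4D, 0x51, 0x58])
print(apply_mask(masked, key))  # b'Hello'
```

A per-byte generator like this is slow for large payloads; real implementations usually process the data in bigger chunks, but the result is the same.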
From this article:
So assuming attackers were able to compromise both the JavaScript code executed in a browser as well as the backend server, masking is designed to prevent the sequence of bytes sent between these two endpoints from being crafted in a special way that could disrupt any broken proxies between these two endpoints (here, "broken" means proxies that might attempt to interpret a WebSocket stream as HTTP when in fact they shouldn't).
The browser (and not the JavaScript code in the browser) has the final say on the randomly generated mask used to send the message which is why it's impossible for the attackers to know what the final stream of bytes the proxy might see will be.
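To illustrate that point, here is a rough Python sketch of what a client-side implementation might do when sending a text frame: it draws a fresh 4-byte key from a cryptographically strong source for every frame, so the application supplying the message never controls the bytes that end up on the wire. The function name `build_masked_text_frame` is made up for this example, and the sketch only covers a single unfragmented text frame.

```python
import secrets
import struct


def build_masked_text_frame(message: str) -> bytes:
    """Build one unfragmented, masked text frame (FIN=1, opcode=0x1).

    Hypothetical helper for illustration; the masking key is chosen
    here, not by the caller, mirroring how a browser picks it.
    """
    payload = message.encode("utf-8")
    key = secrets.token_bytes(4)  # new unpredictable key for each frame

    n = len(payload)
    if n < 126:
        header = bytes([0x81, 0x80 | n])
    elif n < (1 << 16):
        # 126 signals a 16-bit extended length in network byte order.
        header = bytes([0x81, 0x80 | 126]) + struct.pack("!H", n)
    else:
        # 127 signals a 64-bit extended length in network byte order.
        header = bytes([0x81, 0x80 | 127]) + struct.pack("!Q", n)

    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return header + key + masked
```

Sending the same message twice will almost always produce different bytes after the header, which is exactly what stops an attacker from planting a chosen byte sequence in front of a confused proxy.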
Note that the mask is redundant if your WebSocket stream is encrypted (as it should be), although the protocol still requires clients to apply it. Article from the author of Python's Flask: