Why WebSocket can share the 80 port with HTTP “aft

Posted 2019-03-22 00:10

Question:

As I understand:

  • A port designates a program on the server.
  • When we say to share a port, it actually means to have the requests processed by the same program listening on that port.

The WebSocket handshake resembles the HTTP format, so it can be understood by the server program that handles the HTTP protocol. So it's OK to send the handshake request to port 80.

But after the handshake, the WebSocket data format is totally different from the HTTP format. How can it still be sent to port 80, for example via a URL like the one below?

ws://somehost:80/chat

How does it work out?

My guess:

Does the HTTP program see that an incoming request on port 80 cannot be handled as HTTP and then pass it to a WebSocket program for processing? If so, what if some other protocol, say WebSocket2, also wants to share port 80? How could the HTTP program know which protocol to hand the request to if there is no way to identify the protocol being used?

ADD 1

Based on jfriend00's reply, I drew the following diagram:

So WebSocket and HTTP traffic in the same browser are actually carried over different socket connections, though both start by connecting to the server's port 80.

I think if the word WebSocket didn't contain "socket", it would be easier to understand it as just another application-level protocol on top of TCP.

ADD 2

I refined the diagram above based on jfriend00's further comments. What I want to show is how WebSocket communication and HTTP communication with the same server coexist in a browser.

ADD 3

After reading this thread, I recalled that the server port doesn't change when the server accepts a connection (see "Does the port change when a TCP connection is accepted by a server?").

So the diagram should be like this:

The TCP connection for HTTP and the TCP connection for WebSocket use different client ports.

Answer 1:

When a server listens on a given port, it is listening for incoming connections. When a new incoming connection arrives, it is given its own socket to run on. That socket provides the connection between the two endpoints. From then on, that socket runs completely independently from all other sockets that might also be connected.
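As a rough illustration of the point above, the following sketch opens two connections to one listening port on the local machine and checks that each gets its own socket. It is only a sketch: the OS-assigned port stands in for port 80, and the variable names are made up for this example.

```python
import socket

# Listening socket: its only job is to accept incoming connections.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port (stand-in for 80)
listener.listen()
server_port = listener.getsockname()[1]

# Two independent client connections to the same server port.
client_a = socket.create_connection(("127.0.0.1", server_port))
client_b = socket.create_connection(("127.0.0.1", server_port))

# Each accept() returns a new socket dedicated to one connection.
conn_1, peer_1 = listener.accept()
conn_2, peer_2 = listener.accept()

# The server-side port is identical for both connections; the client ports differ.
same_server_port = conn_1.getsockname()[1] == conn_2.getsockname()[1] == server_port
distinct_client_ports = peer_1[1] != peer_2[1]
print(same_server_port, distinct_client_ports)

for s in (client_a, client_b, conn_1, conn_2, listener):
    s.close()
```

Each accepted socket could then speak a different protocol (one plain HTTP, one upgraded to webSocket) without affecting the other, because a connection is identified by both endpoints, not by the server port alone.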

So, one incoming http request can specify the "upgrade" header and get upgraded to webSocket and then both ends agree to talk the webSocket protocol from then on. Meanwhile, other incoming http requests without that upgrade header are treated only as normal http requests.

In case you don't quite understand how the webSocket protocol works, you can get a full look at how it connects here.

Here are the main steps:

  1. The client requesting a webSocket connection sends an HTTP request to the server on port 80.
  2. That HTTP request is a perfectly legal HTTP request, but it includes the header Upgrade: websocket (along with Connection: Upgrade).
  3. If the server supports the webSocket protocol, it responds with a legal HTTP response with a 101 status code (Switching Protocols) that includes the headers Upgrade: websocket and Connection: Upgrade.
  4. At that point, both sides switch to the webSocket protocol, and all further communication on that socket uses the webSocket frame data format.
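The steps above can be sketched from the server's side roughly as follows. This is a minimal illustration, not a complete server: the function names are invented for this example, and the Sec-WebSocket-Key / Sec-WebSocket-Accept exchange is a detail from RFC 6455 that the steps above do not mention.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def parse_headers(raw_request: str) -> dict:
    """Return the request headers as a dict with lower-cased names."""
    headers = {}
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def accept_key(client_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(raw_request: str):
    """Return the 101 response text, or None for an ordinary HTTP request."""
    headers = parse_headers(raw_request)
    key = headers.get("sec-websocket-key")
    if headers.get("upgrade", "").lower() != "websocket" or key is None:
        return None
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(key)}\r\n"
        "\r\n"
    )


# Sample request for ws://somehost:80/chat, using the sample key from RFC 6455.
request = (
    "GET /chat HTTP/1.1\r\n"
    "Host: somehost\r\n"
    "Upgrade: websocket\r\n"
    "Connection: Upgrade\r\n"
    "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==\r\n"
    "Sec-WebSocket-Version: 13\r\n"
    "\r\n"
)
response = handshake_response(request)
print(response)
```

A request without the Upgrade header makes handshake_response return None, which is where a real server would fall through to its normal HTTP handling.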

Any other incoming HTTP requests that do not contain the upgrade request header are treated as normal HTTP requests.

"Does the HTTP program see that the incoming request on port 80 cannot be handled as HTTP, and then it will pass it to WebSocket program to process it."

No, the first request IS a legal HTTP request (just with a special header in it) and the response sent back is a legal HTTP response. But, after that response, both sides switch protocols to webSocket. So the Upgrade header is what tells the web server that this incoming HTTP request is meant to be the first step in establishing a webSocket connection.
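To make the protocol switch concrete, here is a small sketch of what the bytes on the socket look like after the 101 response. It encodes a single unmasked text frame as a server would send it, following the frame layout in RFC 6455; the function name is made up for this example, and frames sent by a client would additionally have to be masked.

```python
def encode_text_frame(message: str) -> bytes:
    """Encode one unmasked webSocket text frame (server-to-client direction)."""
    payload = message.encode("utf-8")
    first_byte = 0x80 | 0x1  # FIN bit set, opcode 0x1 = text frame
    length = len(payload)
    if length < 126:
        header = bytes([first_byte, length])
    elif length < 65536:
        # 126 signals that a 16-bit big-endian length follows.
        header = bytes([first_byte, 126]) + length.to_bytes(2, "big")
    else:
        # 127 signals that a 64-bit big-endian length follows.
        header = bytes([first_byte, 127]) + length.to_bytes(8, "big")
    return header + payload


frame = encode_text_frame("Hello")
print(frame)  # b'\x81\x05Hello'
```

There is no request line and there are no headers here, just a compact binary header followed by the payload. This works on port 80 because by this point both ends have already agreed, on this one connection, to stop parsing HTTP.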

"If so, what if there's some other protocol that wants to share port 80, say WebSocket2, how could HTTP program know which protocol to pass on to if there's not a way to identify the protocol being used."

This upgrade mechanism could be used to support other protocols too, just by specifying a different protocol name, for example Upgrade: someOtherProtocol, though I'm not aware of any others that have been standardized.
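One way to picture this is a lookup keyed by the protocol name in the Upgrade header. The sketch below is purely illustrative: the registry, the handler labels, and the function names are hypothetical, and it treats the header value as a single token even though HTTP allows a comma-separated list.

```python
def requested_upgrade(raw_request: str):
    """Return the lower-cased Upgrade token, or None for a plain HTTP request."""
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        if name.strip().lower() == "upgrade":
            return value.strip().lower()
    return None


# Hypothetical registry: protocol name -> label of the code that takes over the socket.
UPGRADE_HANDLERS = {
    "websocket": "websocket handler",
    "someotherprotocol": "someOtherProtocol handler",
}


def route(raw_request: str) -> str:
    """Pick a handler for one incoming request on the shared port."""
    protocol = requested_upgrade(raw_request)
    if protocol is None:
        return "http handler"
    # A server may ignore an Upgrade it does not support and answer as plain HTTP.
    return UPGRADE_HANDLERS.get(protocol, "http handler")


plain = "GET / HTTP/1.1\r\nHost: somehost\r\n\r\n"
ws = "GET /chat HTTP/1.1\r\nHost: somehost\r\nUpgrade: websocket\r\nConnection: Upgrade\r\n\r\n"
other = "GET /x HTTP/1.1\r\nHost: somehost\r\nUpgrade: someOtherProtocol\r\nConnection: Upgrade\r\n\r\n"
print(route(plain), route(ws), route(other), sep=" | ")
```

The HTTP program never has to guess the protocol from the data itself, because the request names it explicitly.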