TCP connection reset occurs when WSGI app responds

2020-07-18 10:47发布

问题:

For our webservice, I wrote some logic to prevent multipart/form-data POSTs larger than, say, 4mb.

It boils down to the following (I've stripped away all WebOb usage and just reduced it to plain vanilla WSGI code):

import paste.httpserver

form = """\
<html>
<body>
  <form method="post" enctype="multipart/form-data" action="/">
    <input type="file" name="photopicker" />
    <input type="submit" />
  </form>
</body>
</html>
"""

limit = 4 * 1024 * 1024

def upload_app(environ, start_response):
    if environ['REQUEST_METHOD'] == 'POST':
        if int(environ.get('CONTENT_LENGTH', '0')) > limit:
            start_response('400 Ouch', [('content-type', 'text/plain')])
            return ["Upload is too big!"]
    # elided: consume the file appropriately
    start_response('200 OK', [('content-type', 'text/html')])
    return [form]

paste.httpserver.serve(upload_app, port=7007)

The logic shown works right when unit tested. But as soon as I tried sending actual files larger than 4mb to this endpoint, I got errors like these on the client side:

  • Error 101 (net::ERR_CONNECTION_RESET): Unknown error. from Google Chrome
  • The connection to the server was reset while the page was loading. from Firefox

Same error occurs when using Python built-in wsgiref HTTP server.

Fact: once I added environ['wsgi.input'].read() just before responding with HTTP 400, the connection reset problem went away. Of course, this is not a good fix. It just shows what happens when you fully consume the input.

I perused HTTP: The Definitive Guide and found some interesting guidelines on how it was important to mangage TCP connections carefully when implementing HTTP servers and clients. It went on about how, instead of close-ing socket, it was preferred to do shutdown, so that the client had chance to react and stop sending more data to server.

Perhaps I am missing some crucial implementation detail that prevents such connection resets. Insights anyone?

See the gist.

回答1:

This is happening because you are discarding the input stream without reading it, and this is forcing it closed. The browser has queued up a good portion of the file to be sent already and then it gets a write error because the server closes the connection forcefully.

There is no way around this that I know of without reading all the input.

I would recommend some Javascript to test the size of the file before it is sent. Then the only people who get the error are those who are ignoring the client-side check because they don't have Javascript or because they are purposefully trying to be malicious.