Python 2.7: streaming HTTP server supporting multi

Posted 2019-01-09 14:02

Question:

I am looking for a standard Python 2.7 package providing an HTTP server that handles simultaneous streaming connections on the same port number.

Hey you moderators out there, please stop flagging my question as a duplicate of questions that want to serve in non-streaming ways, like this one: Multithreaded web server in python. No, I don't want a hack such as ThreadingMixIn that merely collects up the response and returns it as a unit.

In other words, I'm looking for the standard way to do what the following example program does -- but without writing the whole HTTP server myself.

import time, socket, threading

sock = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
host = socket.gethostname()
port = 8000

sock.bind((host, port))
sock.listen(1)

# my OWN HTTP server... Oh man, this is bad style.
HTTP = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n"

class Listener(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True # stop Python from biting ctrl-C
        self.start()

    def run(self):
        conn, addr = sock.accept()
        conn.send(HTTP)

        # serve up an infinite stream
        i = 0
        while True:
            conn.send("%i " % i)
            time.sleep(0.1)
            i += 1

[Listener() for i in range(100)]
time.sleep(9e9)

So first I tried:

# run with this command:
#    gunicorn -k gevent myapp:app
import time

def app(environ, start_response):
    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    for i in range(5):
        time.sleep(1)
        yield "Hello %i\n" % i

# https://stackoverflow.com/questions/22739394/streaming-with-gunicorn

but unfortunately it doesn't stream, even with the -k gevent.
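One thing worth checking in the attempt above: it declares a Content-Length computed from `data` (14 bytes) but then yields a different, longer body, so the header and the body disagree. The following is a minimal sketch of the same app with the Content-Length header left out, so the server has to decide for itself how to delimit a body of unknown size. The `make_app` helper and its `count` and `delay` parameters are hypothetical names added for illustration, and whether this change alone makes gunicorn stream in the setup described here is an assumption that has not been verified.

```python
import time

def make_app(count=5, delay=1.0):
    """Build a WSGI app that yields `count` lines, pausing `delay` seconds before each."""
    def app(environ, start_response):
        # No Content-Length here: the total size is not known up front,
        # and a wrong value would contradict the body that follows.
        start_response("200 OK", [("Content-Type", "text/plain")])
        for i in range(count):
            time.sleep(delay)
            # WSGI bodies are byte strings, so encode explicitly.
            yield ("Hello %i\n" % i).encode("ascii")
    return app

# Module-level callable, so the same command line still applies:
#    gunicorn -k gevent myapp:app
app = make_app()
```

Because `app` is a generator function, `start_response` is only called once the server starts iterating over the result.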

Update: it appears that gunicorn is trying to do keepalive, which for a response of unknown length would require chunked transfer coding, terminated by a zero-length last-chunk. A quick grep of the sources suggests that it's not implementing that. So I might need a much fancier HTTP server, or a simpler one (like my first example above, based on socket) that doesn't bother with keepalive (which is pretty silly for large streams anyway).
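For reference, the chunked framing mentioned in the update is simple to write down. Each chunk is its size in hexadecimal, a CRLF, the payload, and another CRLF, and the body ends with a zero-length chunk. The sketch below only shows that framing; `encode_chunk`, `encode_stream` and `LAST_CHUNK` are hypothetical names, not part of gunicorn or the standard library, and a real response would also need a `Transfer-Encoding: chunked` header.

```python
def encode_chunk(payload):
    """Frame one piece of the body as an HTTP/1.1 chunk."""
    # <size in hex> CRLF <payload> CRLF
    return ("%x" % len(payload)).encode("ascii") + b"\r\n" + payload + b"\r\n"

# A zero-length chunk followed by an empty trailer section ends the body.
LAST_CHUNK = b"0\r\n\r\n"

def encode_stream(pieces):
    """Yield framed chunks for each non-empty piece, then the terminator."""
    for piece in pieces:
        if piece:  # an empty piece would look like the last chunk, so skip it
            yield encode_chunk(piece)
    yield LAST_CHUNK
```

With this framing the client can tell where the body ends without a Content-Length, which is what lets a keepalive connection be reused after a stream of unknown size.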

So then I tried:

import time
import threading

import BaseHTTPServer

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):

    def do_GET(self):
        if self.path != '/':
            self.send_error(404, "Object not found")
            return
        self.send_response(200)
        self.send_header('Content-type', 'text/html; charset=utf-8')
        self.end_headers()

        # serve up an infinite stream
        i = 0
        while True:
            self.wfile.write("%i " % i)
            time.sleep(0.1)
            i += 1

class Listener(threading.Thread):

    def __init__(self, i):
        threading.Thread.__init__(self)
        self.i = i
        self.daemon = True
        self.start()

    def run(self):
        server_address = ('', 8000+self.i) # How to attach all of them to 8000?
        httpd = BaseHTTPServer.HTTPServer(server_address, Handler)
        httpd.serve_forever()

[Listener(i) for i in range(100)]
time.sleep(9e9)

which is pretty good, but it's a bit annoying that I have to allocate 100 port numbers. This will require an obnoxious client-side redirect to get the browser to the next available port. (Well, OK, I can hide it with JavaScript, but that's not very elegant. I'd rather write my own HTTP server than do that.)

There must be a clean way to just get all the BaseHTTPServer listeners on one port, as it is such a standard way of setting up a web server. Or maybe gunicorn or somesuch package can be made to stream reliably?

Answer 1:

By default, every BaseHTTPServer.HTTPServer instance creates and binds its own socket, which won't work in Linux if all the listeners are on the same port. To avoid that, pass False as the third argument (bind_and_activate) to the BaseHTTPServer.HTTPServer() call, then swap in one shared, already-bound socket before the serve_forever() call.

The following example launches 100 server threads on the same port, each running its own BaseHTTPServer.HTTPServer on the shared socket.

import time, threading, socket, BaseHTTPServer

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):

    def do_GET(self):
        if self.path != '/':
            self.send_error(404, "Object not found")
            return
        self.send_response(200)
        self.send_header('Content-type', 'text/html; charset=utf-8')
        self.end_headers()

        # serve up an infinite stream
        i = 0
        while True:
            self.wfile.write("%i " % i)
            time.sleep(0.1)
            i += 1

# Create ONE socket.
addr = ('', 8000)
sock = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(addr)
sock.listen(5)

# Launch 100 listener threads.
class Thread(threading.Thread):
    def __init__(self, i):
        threading.Thread.__init__(self)
        self.i = i
        self.daemon = True
        self.start()
    def run(self):
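        # Third argument is bind_and_activate=False: don't bind or listen yet.
        httpd = BaseHTTPServer.HTTPServer(addr, Handler, False)

        # Prevent the HTTP server from re-binding every handler.
        # https://stackoverflow.com/questions/46210672/
        httpd.socket = sock
        # Override on the server (not on this thread) so that neither call
        # touches the shared socket; instance attributes take no self argument.
        httpd.server_bind = httpd.server_close = lambda: None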

        httpd.serve_forever()
[Thread(i) for i in range(100)]
time.sleep(9e9)
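The reason this works is that several threads can block in accept() on the same listening socket, and the operating system hands each incoming connection to exactly one of them. Here is a stripped-down sketch of just that mechanism, without any HTTP, written so it runs on both Python 2.7 and Python 3. The names `serve_shared` and `demo` are hypothetical, and the workers here handle one connection each and then exit, unlike serve_forever().

```python
import socket
import threading

def serve_shared(n_workers=3):
    """Bind ONE listening socket and start n_workers threads that accept on it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(n_workers)

    def worker(worker_id):
        # Every worker blocks on the same socket; each connection goes to one.
        conn, _ = listener.accept()
        conn.sendall(("worker %d\n" % worker_id).encode("ascii"))
        conn.close()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.daemon = True
        t.start()
    return listener, threads

def demo(n_clients=3):
    """Connect n_clients times to the shared port and collect one line from each."""
    listener, threads = serve_shared(n_clients)
    port = listener.getsockname()[1]
    replies = []
    for _ in range(n_clients):
        client = socket.create_connection(("127.0.0.1", port))
        replies.append(client.makefile("rb").readline())
        client.close()
    for t in threads:
        t.join(5)
    listener.close()
    return sorted(replies)
```

Calling `demo(3)` should return one reply per worker, which shows that a single bound port is enough for all of the threads.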