I've written an HTTP server that produces endless HTTP streams consisting of JSON-structured events, similar to Twitter's streaming API. These events are separated by \n (as in Server-Sent Events with Content-Type: text/event-stream) and can vary in length.
The response is
- chunked (HTTP 1.1 Transfer-Encoding:chunked) due to the endless stream
- compressed (Content-Encoding: gzip) to save bandwidth.
I want to consume these lines in Python as soon as they arrive and as resource-efficiently as possible, without reinventing the wheel.
As I'm currently using python-requests, do you know how to make it work? If you think python-requests cannot help here, I'm totally open to alternative frameworks/libraries.
My current implementation is based on requests and uses iter_lines(...) to receive the lines. But the chunk_size parameter is tricky. If set to 1, it is very CPU-intensive, since some events can be several kilobytes. If set to any value above 1, some events get stuck until the next ones arrive and the whole buffer has been filled. And the time between events can be several seconds.
I expected chunk_size to be some sort of "maximum number of bytes to receive", as in Unix's recv(...). The corresponding man page says:
The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
But this is obviously not how it works in the requests-library. They use it more or less as an "exact number of bytes to receive". While looking at their source code, I couldn't identify which part is responsible for that. Maybe httplib's Response or ssl's SSLSocket.
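For reference, the recv(...) semantics quoted above can be reproduced with a local socket pair. The sketch below is only an illustration under stated assumptions: the function name, payload and timeout are made up for the example, and it uses Python 3's socket module rather than anything from requests. For contrast it also reads through a buffered file object created with socket.makefile(), whose read(amount) keeps waiting for the full amount; the short timeout is only there so that the wait becomes observable.

```python
import socket


def recv_vs_buffered_read(payload=b'{"event": 1}\n', amount=1024, timeout=0.2):
    """Return (bytes returned by recv, whether the buffered read timed out)."""
    # Part 1: plain recv() hands back whatever is available, up to `amount`.
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        received = receiver.recv(amount)
    finally:
        sender.close()
        receiver.close()

    # Part 2: a buffered file object waits until `amount` bytes (or EOF) arrive.
    sender, receiver = socket.socketpair()
    receiver.settimeout(timeout)  # only so the wait shows up as a timeout
    reader = receiver.makefile("rb")
    try:
        sender.sendall(payload)
        try:
            reader.read(amount)
            timed_out = False
        except socket.timeout:
            timed_out = True
    finally:
        reader.close()
        sender.close()
        receiver.close()
    return received, timed_out
```

With a payload much shorter than the requested amount, the first part returns immediately while the second part waits for more data that never comes.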
As a workaround I tried padding my lines on the server to a multiple of the chunk-size. But the chunk-size in the requests-library is used to fetch bytes from the compressed response stream. So this won't work until I can pad my lines so that their compressed byte-sequence is a multiple of the chunk-size. But this seems far too hacky.
I've read that Twisted could be used for non-blocking, non-buffered processing of http streams on the client, but I only found code for creating stream responses on the server.
It is not requests' fault that your iter_lines() calls are blocking.
The Response.iter_lines() method calls Response.iter_content(), which calls urllib3's HTTPResponse.stream(), which calls HTTPResponse.read().
These calls pass along a chunk-size, which is what is passed on to the socket as self._fp.read(amt). This is the problematic call, as self._fp is a file object produced by socket.makefile() (as done by the httplib module); and this .read() call will block until amt (compressed) bytes are read.
This low-level socket file object does support a .readline() call that will work more efficiently, but urllib3 cannot make use of this call when handling compressed data; line terminators are not going to be visible in the compressed stream.
Unfortunately, urllib3 won't call self._fp.readline() when the response isn't compressed either; the way the calls are structured it'd be hard to pass along that you want to read in line-buffering mode instead of in chunk-buffering mode as it is.
I must say that HTTP is not the best protocol to use for streaming events; I'd use a different protocol for this. Websockets spring to mind, or a custom protocol for your specific use-case.
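To illustrate why line terminators only become visible after decompression, here is a minimal sketch that decodes a gzip stream incrementally with the standard zlib module and splits lines on the decoded bytes. It is not what urllib3 does internally. Both function names are hypothetical, and gzip_events() merely simulates a server under the assumption that it flushes the compressor after every event (without such a flush a line may sit in the compressor until more data is written).

```python
import zlib


def gzip_events(events):
    """Simulated server side (assumption): gzip each event, flushing after it."""
    encoder = zlib.compressobj(6, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    out = b""
    for event in events:
        out += encoder.compress(event + b"\n")
        out += encoder.flush(zlib.Z_SYNC_FLUSH)  # make the event decodable now
    return out


def iter_lines_from_gzip_chunks(chunks):
    """Yield complete lines from compressed chunks of arbitrary size."""
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect a gzip header
    pending = b""
    for chunk in chunks:
        # Newlines can only be found in the decompressed bytes.
        pending += decoder.decompress(chunk)
        *lines, pending = pending.split(b"\n")
        for line in lines:
            yield line
    pending += decoder.flush()
    if pending:
        yield pending  # trailing data without a final newline
```

The chunk boundaries are independent of the line boundaries here, so the consumer can read whatever the socket delivers instead of waiting for a fixed number of compressed bytes.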
Thanks to Martijn Pieters' answer I stopped working around python-requests' behavior and looked for a completely different approach.
I ended up using pyCurl. You can use it much like a select+recv loop, without inverting the control flow and giving up control to a dedicated IO-loop as in Tornado, etc. This way it is easy to use a generator that yields new lines as soon as they arrive - without further buffering in intermediate layers that could introduce delay, and without additional threads that run the IO-loop.
At the same time, it is high-level enough that you don't need to bother about chunked transfer encoding, SSL encryption or gzip compression.
This was my old code, where chunk_size=1 resulted in 45% CPU load and chunk_size>1 introduced additional lag.
Here is my new code based on pyCurl: (Unfortunately the curl_easy_* style perform blocks completely, which makes it difficult to yield lines in between without using threads. Thus I'm using the curl_multi_* methods.)
This code tries to fetch as many bytes as possible from the incoming stream, without blocking unnecessarily if there are only a few. In comparison, the CPU load is around 0.2%.
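A minimal sketch of such a curl_multi_* loop follows. It is not the original listing: LineBuffer and iter_stream_lines are hypothetical names, the one-second select timeout is arbitrary, and it assumes the third-party pycurl package is installed (it is imported inside the generator so the line-splitting helper can be used without it).

```python
class LineBuffer:
    """Collects raw bytes and hands back the complete lines seen so far."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        self._pending += data
        *lines, self._pending = self._pending.split(b"\n")
        return lines


def iter_stream_lines(url, timeout=1.0):
    """Yield lines from an endless HTTP stream as soon as they arrive."""
    import pycurl  # third-party; assumed to be installed

    buffer = LineBuffer()
    ready = []

    def on_body(data):
        # libcurl has already undone chunked encoding and gzip at this point.
        ready.extend(buffer.feed(data))

    easy = pycurl.Curl()
    easy.setopt(pycurl.URL, url)
    easy.setopt(pycurl.ENCODING, "gzip")  # request and decode gzip
    easy.setopt(pycurl.WRITEFUNCTION, on_body)

    multi = pycurl.CurlMulti()
    multi.add_handle(easy)
    try:
        active = 1
        while active:
            # Let libcurl process whatever is available without blocking.
            while True:
                status, active = multi.perform()
                if status != pycurl.E_CALL_MULTI_PERFORM:
                    break
            while ready:
                yield ready.pop(0)
            if active:
                multi.select(timeout)  # sleep until the socket has data
    finally:
        multi.remove_handle(easy)
        easy.close()
        multi.close()
```

A caller would then iterate with something like `for line in iter_stream_lines(url): ...` and parse each line as JSON; control stays with the caller between lines, which is the point of using the multi interface here.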