My problem is similar to "python - How select.select() works?". However, the solution there doesn't work for me, because I'm not open()ing a file; I'm using a socket. I couldn't find any way in the documentation to make it unbuffered.
I have a glib main loop (which uses select), in which I registered the socket for reading. Because socket.recv() requires me to specify a receive buffer size, it is not unusual to read fewer bytes than the socket has available. As long as the kernel buffers the rest, that is fine; select will still mark the socket as "ready for reading". But apparently Python has a buffer as well. With large files, near the end of the data stream, recv() returns only part of the remaining data. The rest seems to be buffered by Python, and select no longer triggers on my socket until new data is sent. At that point, the "missing" data is received before the new data; no data is lost.
My question is: how do I solve this? Is there a way to disable Python's buffer on the socket? If not, is there a way to check if the buffer is empty, so I can make sure I don't return from my callback until it is?
Edit:
As noted in the comment, Python doesn't add an extra buffer to sockets, so this could not be the problem. I was unable to create a minimal example for the problem. However, it seems that it may be related to using ssl sockets. I had forgotten that I used an encrypted connection; disabling the encryption seems to solve this issue, but is not acceptable to me. So the above question remains, with the note that the buffers are probably implemented in the ssl module.
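The claim that a plain socket has no extra Python-level buffer can be checked with a small sketch like the one below. It uses socket.socketpair() instead of a real connection, and the helper names readable() and demo() are made up for this illustration. It reads fewer bytes than were sent and then asks select whether the socket is still ready.

```python
import select
import socket


def readable(sock):
    """Return True if select() reports sock as ready for reading right now."""
    ready, _, _ = select.select([sock], [], [], 0)  # timeout 0: just poll
    return bool(ready)


def demo():
    # A connected pair of plain (unencrypted) sockets; no network needed.
    a, b = socket.socketpair()
    try:
        a.sendall(b"0123456789")
        first = b.recv(4)                # deliberately read less than was sent
        still_ready = readable(b)        # the other 6 bytes sit in the kernel buffer
        rest = b.recv(1024)              # read whatever is left
        ready_after_drain = readable(b)  # nothing left, so select should not fire
        return first, still_ready, rest, ready_after_drain
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

If the plain socket behaves as described, select keeps reporting it as readable after the partial read and stops only once everything has been received.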
Example code to show the problem:
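    #!/usr/bin/python
    import glib
    import socket
    import ssl

    def cb (fd, cond):
        # Read a single byte per callback to make the buffering visible.
        print ('data: %s' % repr (s.read (1)))
        return True  # keep the watch installed

    s = ssl.wrap_socket (socket.create_connection (('localhost', 1234)))
    # Watch the underlying file descriptor for readability.
    glib.io_add_watch (s.fileno (), glib.IO_IN, cb)
    glib.MainLoop ().run ()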
Then run a server with
    openssl s_server -accept 1234 -key file.key -cert file.crt
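Running the Python program establishes the connection. If the server sends more than one byte, the program prints only the first byte. When more data is sent later, the remaining bytes of the earlier data are read first, then the first byte of the new data, and then the program waits again. This is easy to understand: the ssl module has already moved the earlier data out of the kernel buffer into its own buffer, so select has nothing to report. Once new data arrives, it stays in the kernel buffer for as long as the ssl buffer still holds data, so select keeps reporting the socket as readable until the ssl buffer is drained.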
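One possible way to avoid returning from the callback while decrypted data is still buffered is to check SSLSocket.pending(), which the ssl module documents as the number of already decrypted bytes available for reading. The sketch below is not a tested fix for the program above. The helper read_available() is hypothetical, and FakeTLSSocket is a made-up stand-in that imitates only the buffering behaviour described here, so the idea can be tried without a server.

```python
class FakeTLSSocket(object):
    """Hypothetical stand-in for an SSL socket; it mimics only the buffering."""

    def __init__(self, records):
        self._kernel = list(records)  # whole records still "in the kernel buffer"
        self._plain = b""             # decrypted bytes held in user space

    def recv(self, n):
        # Like the ssl layer, pull in a whole record even if n is small.
        if not self._plain and self._kernel:
            self._plain = self._kernel.pop(0)
        data, self._plain = self._plain[:n], self._plain[n:]
        return data

    def pending(self):
        # Same meaning as ssl.SSLSocket.pending(): decrypted bytes not yet read.
        return len(self._plain)


def read_available(sock, chunk_size=4096):
    """Read once, then keep reading while decrypted bytes are still pending."""
    chunks = [sock.recv(chunk_size)]
    while sock.pending() > 0:
        # These bytes are already decrypted, so this read needs no new
        # data from the kernel.
        chunks.append(sock.recv(sock.pending()))
    return b"".join(chunks)


if __name__ == "__main__":
    fake = FakeTLSSocket([b"hello", b"world"])
    print(repr(read_available(fake, chunk_size=1)))
```

In the glib callback, cb could call read_available(s) instead of s.read(1), assuming s is the wrapped socket from the example. On a blocking socket, the first recv() may still block if only part of a TLS record has arrived.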