I'm trying to make sockets time out in Ruby via the SO_RCVTIMEO socket option; however, it seems to have no effect on any recent *nix operating system.
Using Ruby's Timeout module is not an option, as it requires spawning and joining threads for each timeout, which can become expensive. In applications that require low socket timeouts and have a high number of threads, it essentially kills performance. This has been noted in many places, including Stack Overflow.
I've read Mike Perham's excellent post on the subject here and, in an effort to reduce the problem to one file of runnable code, created a simple example: a TCP server that receives a request, waits for the amount of time sent in the request, and then closes the connection.
The client creates a socket, sets the receive timeout to be 1 second, and then connects to the server. The client tells the server to close the session after 5 seconds then waits for data.
The client should time out after one second, but instead the connection is closed normally after 5 seconds.
#!/usr/bin/env ruby
require 'socket'

def timeout
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

  # Timeout set to 1 second
  timeval = [1, 0].pack("l_2")
  sock.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval

  # Connect and tell the server to wait 5 seconds
  sock.connect(Socket.pack_sockaddr_in(1234, '127.0.0.1'))
  sock.write("5\n")

  # Wait for data to be sent back
  begin
    result = sock.recvfrom(1024)
    puts "session closed"
  rescue Errno::EAGAIN
    puts "timed out!"
  end
end

Thread.new do
  server = TCPServer.new(nil, 1234)
  while (session = server.accept)
    request = session.gets
    sleep request.to_i
    session.close
  end
end

timeout
I've tried doing the same thing with a TCPSocket as well (which connects automatically) and have seen similar code in redis and other projects.
Additionally, I can verify that the option has been set by calling getsockopt like this:
sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_RCVTIMEO).inspect
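For a more readable check than inspect, the returned option can be unpacked back into seconds and microseconds. This is only a minimal sketch: it assumes the platform's struct timeval is laid out as two native longs (as on typical 64-bit Linux), which is the same assumption the "l_2" pack format above already makes.

```ruby
require 'socket'

sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

# Same encoding as in the question: 1 second, 0 microseconds.
timeval = [1, 0].pack("l_2")
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval)

# getsockopt returns a Socket::Option; #data holds the raw bytes reported by the kernel.
option = sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_RCVTIMEO)
seconds, microseconds = option.data.unpack("l_2")

puts "SO_RCVTIMEO is #{seconds}s #{microseconds}us"
sock.close
```

If the values match what was set, the kernel has accepted the option, so the problem lies in how the read is performed rather than in setsockopt.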
Does setting this socket option actually work for anyone?
You can do this efficiently using select from Ruby's IO class.
IO::select takes 4 parameters. The first three are arrays of IO objects (sockets, in this case) to monitor, and the last one is a timeout (specified in seconds).
select blocks until at least one of the given IO objects is ready to be read from, ready to be written to, or has a pending exception.
The first three arguments, therefore, correspond to the different types of states to monitor:
- Ready for reading
- Ready for writing
- Has pending exception
The fourth is the timeout you want to set (if any). We are going to take advantage of this parameter.
Select returns an array that contains arrays of IO objects (sockets in this case) which are deemed ready by the operating system for the particular action being monitored.
So the return value of select will look like this:
[
[sockets ready for reading],
[sockets ready for writing],
[sockets raising errors]
]
However, select returns nil if the optional timeout value is given and no IO object is ready within timeout seconds.
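Both outcomes can be seen in a small, self-contained sketch. The reader and writer names are illustrative, and a local UNIXSocket pair stands in for a real client/server connection (so this assumes a Unix-like system).

```ruby
require 'socket'

# A connected pair of local sockets stands in for a client/server connection.
reader, writer = UNIXSocket.pair

# Nothing has been written yet, so select gives up after 0.1 seconds.
idle = IO.select([reader], nil, nil, 0.1)
puts idle.inspect

# Once data is waiting, select returns the three arrays described above.
writer.write("ping")
readable, writable, errored = IO.select([reader], [writer], nil, 0.1)

puts readable == [reader]   # the reader has data waiting
puts writable == [writer]   # the writer has buffer space
puts errored.empty?         # no pending exceptions

reader.close
writer.close
```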
Therefore, if you want performant IO timeouts in Ruby without having to use the Timeout module, you can call select yourself.
Let's build an example where we wait timeout seconds for a read on socket:
ready = IO.select([socket], nil, nil, timeout)
if ready
  # do the read
else
  # raise something that indicates a timeout
end
This has the benefit of not spinning up a new thread for each timeout (as in the Timeout module) and will make multi-threaded applications with many timeouts much faster in Ruby.
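To make the pattern concrete, here is a rough sketch of it wrapped in a helper and run against a local server. The names read_with_timeout and ReadTimeout are hypothetical (they are not part of Ruby), and the server binds to port 0 so the operating system picks a free port.

```ruby
require 'socket'

# Hypothetical error class for this sketch; not part of Ruby's standard library.
class ReadTimeout < StandardError; end

# Waits up to `timeout` seconds for `socket` to become readable, then reads.
def read_with_timeout(socket, maxlen, timeout)
  ready = IO.select([socket], nil, nil, timeout)
  raise ReadTimeout, "no data within #{timeout}s" unless ready
  socket.read_nonblock(maxlen)
end

server  = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS choose a free port
client  = TCPSocket.new('127.0.0.1', server.addr[1])
session = server.accept

# The server stays silent, so the read times out.
timed_out = false
begin
  read_with_timeout(client, 1024, 0.2)
rescue ReadTimeout => e
  timed_out = true
  puts "timed out: #{e.message}"
end

# Once the server sends something, the same call returns the data.
session.write("hello\n")
data = read_with_timeout(client, 1024, 0.2)
puts data

[client, session, server].each(&:close)
```

Using read_nonblock after select is a design choice in this sketch: if the socket turns out not to be readable after all, the call raises instead of blocking past the timeout.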
I think you're basically out of luck. When I run your example with strace (only using an external server to keep the output clean), it's easy to check that setsockopt is indeed getting called:
$ strace -f ruby foo.rb 2>&1 | grep setsockopt
[pid 5833] setsockopt(5, SOL_SOCKET, SO_RCVTIMEO, "\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
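The 16-byte argument in that trace is the packed timeval. As a quick sanity check, it can be decoded in Ruby. This sketch assumes a little-endian 64-bit Linux layout (two 8-byte fields), which is what the 16-byte length suggests.

```ruby
# The bytes strace printed: "\1" followed by fifteen "\0" bytes.
raw = ("\x01" + "\x00" * 15).b

# Decode as two little-endian signed 64-bit integers: tv_sec and tv_usec.
tv_sec, tv_usec = raw.unpack("q<2")

puts "tv_sec=#{tv_sec} tv_usec=#{tv_usec}"
```

That corresponds to the 1-second timeout the client asked for.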
strace also shows what's blocking the program. This is the line I see on the screen before the server times out:
[pid 5958] ppoll([{fd=5, events=POLLIN}], 1, NULL, NULL, 8
That means that the program is blocking on this call to ppoll, not on a call to recvfrom. The man page that lists socket options (socket(7)) states that:
Timeouts have no effect for select(2), poll(2), epoll_wait(2), etc.
So the timeout is being set but has no effect. I hope I'm wrong here, but it seems there's no way to change this behavior in Ruby. I took a quick look at the implementation and didn't find an obvious way out. Again, I hope I'm wrong -- this seems to be something basic, how come it's not there?
One (very ugly) workaround is to use dl to call read or recvfrom directly. Those calls are affected by the timeout you set. For example:
require 'socket'
require 'dl'
require 'dl/import'

module LibC
  extend DL::Importer
  dlload 'libc.so.6'
  extern 'long read(int, void *, long)'
end

sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
timeval = [3, 0].pack("l_l_")
sock.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval
sock.connect( Socket.pack_sockaddr_in(1234, '127.0.0.1'))

buf = "\0" * 1024
count = LibC.read(sock.fileno, buf, 1024)
if count == -1
  puts 'Timeout'
end
This code works here. Of course, it's an ugly solution that won't work on many platforms, among other drawbacks. It may be a way out, though.
Also, please note that this is the first time I've done something like this in Ruby, so I may be overlooking some pitfalls. In particular, I'm suspicious of the types I specified in 'long read(int, void *, long)' and of the way I'm passing a buffer to read.
Based on my testing and Jesse Storimer's excellent ebook "Working with TCP Sockets" (in Ruby), the timeout socket options do not work in Ruby 1.9 (and, I presume, 2.0 and 2.1). Jesse says:
"Your operating system also offers native socket timeouts that can be set via the SNDTIMEO and RCVTIMEO socket options. But, as of Ruby 1.9, this feature is no longer functional."
Wow. I think the moral of the story is to forget about these options and use IO.select or Tony Arcieri's NIO library.