可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I'm trying to make sockets timeout in Ruby via the SO_RCVTIMEO socket option however it seems to have no effect on any recent *nix operating system.

Using Ruby's Timeout module is not an option as it requires spawning and joining threads for each timeout which can become expensive. In applications that require low socket timeouts and which have a high number of threads it essentially kills performance. This has been noted in many places including Stack Overflow.

I've read Mike Perham's excellent post on the subject here and in an effort to reduce the problem to one file of runnable code created a simple example of a TCP server that will receive a request, wait the amount of time sent in the request and then close the connection.

The client creates a socket, sets the receive timeout to be 1 second, and then connects to the server. The client tells the server to close the session after 5 seconds then waits for data.

The client should timeout after one second but instead successfully closes the connection after 5.

#!/usr/bin/env ruby
require 'socket'

def timeout
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

  # Timeout set to 1 second
  timeval = [1, 0].pack("l_2")
  sock.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval

  # Connect and tell the server to wait 5 seconds
  sock.connect(Socket.pack_sockaddr_in(1234, '127.0.0.1'))
  sock.write("5\n")

  # Wait for data to be sent back
  begin
    result = sock.recvfrom(1024)
    puts "session closed"
  rescue Errno::EAGAIN
    puts "timed out!"
  end
end

Thread.new do
  server = TCPServer.new(nil, 1234)
  while (session = server.accept)
    request = session.gets
    sleep request.to_i
    session.close
  end
end

timeout

I've tried doing the same thing with a TCPSocket as well (which connects automatically) and have seen similar code in redis and other projects.

Additionally, I can verify that the option has been set by calling getsockopt like this:

sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_RCVTIMEO).inspect

Does setting this socket option actually work for anyone?

回答1:

You can do this efficiently using select from Ruby's IO class.

IO::select takes 4 parameters. The first three are arrays of sockets to monitor and the last one is a timeout (specified in seconds).

The way select works is that it makes lists of IO objects ready for a given operation by blocking until at least one of them is ready to either be read from, written to, or wants to raise an error.

The first three arguments therefore, correspond to the different types of states to monitor.

Ready for reading
Ready for writing
Has pending exception

The fourth is the timeout you want to set (if any). We are going to take advantage of this parameter.

Select returns an array that contains arrays of IO objects (sockets in this case) which are deemed ready by the operating system for the particular action being monitored.

So the return value of select will look like this:

[
  [sockets ready for reading],
  [sockets ready for writing],
  [sockets raising errors]
]

However, select returns nil if the optional timeout value is given and no IO object is ready within timeout seconds.

Therefore, if you want to do performant IO timeouts in Ruby and avoid having to use the Timeout module, you can do the following:

Let's build an example where we wait timeout seconds for a read on socket:

ready = IO.select([socket], nil, nil, timeout)

if ready
  # do the read
else
  # raise something that indicates a timeout
end

This has the benefit of not spinning up a new thread for each timeout (as in the Timeout module) and will make multi-threaded applications with many timeouts much faster in Ruby.

回答2:

I think you're basically out of luck. When I run your example with strace (only using an external server to keep the output clean), it's easy to check that setsockopt is indeed getting called:

$ strace -f ruby foo.rb 2>&1 | grep setsockopt
[pid  5833] setsockopt(5, SOL_SOCKET, SO_RCVTIMEO, "\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0

strace also shows what's blocking the program. This is the line I see on the screen before the server times out:

[pid  5958] ppoll([{fd=5, events=POLLIN}], 1, NULL, NULL, 8

That means that the program is blocking on this call to ppoll, not on a call to recvfrom. The man page that lists socket options (socket(7)) states that:

Timeouts have no effect for select(2), poll(2), epoll_wait(2), etc.

So the timeout is being set but has no effect. I hope I'm wrong here, but it seems there's no way to change this behavior in Ruby. I took a quick look at the implementation and didn't find an obvious way out. Again, I hope I'm wrong -- this seems to be something basic, how come it's not there?

One (very ugly) workaround is by using dl to call read or recvfrom directly. Those calls are affected by the timeout you set. For example:

require 'socket'
require 'dl'
require 'dl/import'

module LibC
  extend DL::Importer
  dlload 'libc.so.6'
  extern 'long read(int, void *, long)'
end

sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
timeval = [3, 0].pack("l_l_")
sock.setsockopt Socket::SOL_SOCKET, Socket::SO_RCVTIMEO, timeval
sock.connect( Socket.pack_sockaddr_in(1234, '127.0.0.1'))

buf = "\0" * 1024
count = LibC.read(sock.fileno, buf, 1024)
if count == -1
  puts 'Timeout'
end

This code works here. Of course: it's an ugly solution, which won't work on many platforms, etc. It may be a way out though.

Also please notice that this is the first time I do something similar in Ruby, so I'm not aware of all the pitfalls I may be overlooking -- in particular, I'm suspect of the types I specified in 'long read(int, void *, long)' and of the way I'm passing a buffer to read.

回答3:

Based on my testing, and Jesse Storimer's excellent ebook on "Working with TCP Sockets" (in Ruby), the timeout socket options do not work in Ruby 1.9 (and, I presume 2.0 and 2.1). Jesse says:

Your operating system also offers native socket timeouts that can be set via the SNDTIMEO and RCVTIMEO socket options. But, as of Ruby 1.9, this feature is no longer functional."

Wow. I think the moral of the story is to forget about these options and use IO.select or Tony Arcieri's NIO library.