Robustly call a flaky API: proper error handling w

Posted 2020-06-03 03:59

Question:

I hacked this together as a seemingly robust way to call a flaky webservice that was giving timeouts and the occasional name resolution or socket error or whatever. I thought I'd put it here in case it's useful or, more likely, to be told a better way to do this.

require 'net/http'

retries = 5
begin
  url = URI.parse('http://api.flakywebservice.com')
  http = Net::HTTP.new(url.host, url.port)
  http.read_timeout = 600  # be very patient
  res = nil
  http.start{|http|
    req = Net::HTTP::Post.new(url.path)
    req.set_form_data(params)  # send a hash of the POST parameters
    res = http.request(req)
  }
rescue Exception   # should really list all the possible http exceptions
  sleep 3
  retry if (retries -= 1) > 0
end

# finally, do something with res.body, like JSON.parse(res.body)

The heart of this question is: What all exceptions should I be looking for when making a call to a webservice like this? Here's an attempt to collect them all, but it seems like there's got to be a better way than that: http://tammersaleh.com/posts/rescuing-net-http-exceptions

Answer 1:

Exceptions are meaningful, and Net::HTTP offers specific exceptions for different sorts of cases. So if you want to handle them each in a particular way, you can.

That article says that rescuing those specific exceptions is better and safer than rescue Exception, and that's very true. But rescue Exception is different from a bare rescue: a bare rescue is equivalent to rescue StandardError, which is what you should usually use by default unless you have a reason to do something else.

Rescuing top-level Exception will rescue anything that could possibly happen in the entire execution stack, including things you almost never want to swallow: the interpreter running out of memory (NoMemoryError), a Ctrl-C (Interrupt), or a call to exit (SystemExit).
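To make the difference concrete, here is a small sketch that checks which exception classes a bare rescue would cover. The helper name bare_rescue_catches? is made up for this illustration; it relies only on the fact that a bare rescue matches StandardError and its subclasses.

```ruby
# Hypothetical helper (not part of Ruby): true if a bare `rescue`,
# i.e. `rescue StandardError`, would catch an exception of this class.
def bare_rescue_catches?(error_class)
  error_class.ancestors.include?(StandardError)
end

# A few network-ish errors versus a few "leave these alone" errors.
[IOError, EOFError, Errno::ECONNRESET, NoMemoryError, Interrupt, SystemExit].each do |klass|
  verdict = bare_rescue_catches?(klass) ? 'bare rescue' : 'only rescue Exception'
  puts "#{klass}: #{verdict}"
end
```

In the question's loop, rescue Exception means a Ctrl-C during the request is caught too, so the script sleeps and retries instead of stopping.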

So, as far as "what to rescue" goes, you're generally better off changing your code to use a bare rescue. You'll catch everything you want to, and nothing that you don't want to. However, in this particular case, there is one lone exception in that article's list that is NOT a descendant of StandardError:

require 'net/http'
require 'timeout'

# Return the ancestor chain of a class, root first.
def parents(obj)
  ( (obj.superclass ? parents(obj.superclass) : []) << obj)
end

# Collect the classes from the article that do NOT descend from StandardError.
[Timeout::Error, Errno::EINVAL, Errno::ECONNRESET, EOFError, Net::HTTPBadResponse,
  Net::HTTPHeaderSyntaxError, Net::ProtocolError].inject([]) do |a,c|
  parents(c).include?(StandardError) ? a : a << c
end
# => [Timeout::Error]   (on Ruby 1.8, where Timeout::Error < Interrupt)

parents(Timeout::Error)
# On Ruby 1.8:
# [ Object, Exception < Object, SignalException < Exception,
#   Interrupt < SignalException, Timeout::Error < Interrupt ]

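So you could change your code to rescue StandardError, Timeout::Error => e and you'll cover all the cases mentioned in that article, and more, but not the stuff that you don't want to cover. (The => e is not required, but more on that below.) Note that the Timeout::Error < Interrupt ancestry is the Ruby 1.8 behavior; on Ruby 1.9 and later Timeout::Error inherits from RuntimeError, so a bare rescue already covers it and listing it explicitly is harmless but redundant.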
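Putting that rescue clause into the retry loop from the question might look like the sketch below. The with_retries helper and its attempts/delay parameters are made up for this illustration, and the flaky call is simulated so the snippet runs without a network. Unlike the original loop, it re-raises the last error once the attempts are used up, rather than falling through with res still nil.

```ruby
require 'timeout'

# Hypothetical helper (not from the original post): run the block, making at
# most `attempts` tries in total and sleeping `delay` seconds between tries.
def with_retries(attempts: 5, delay: 3)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError, Timeout::Error => e
    raise if tries >= attempts   # out of tries: let the caller see the error
    warn "try #{tries} failed (#{e.class}), retrying"
    sleep delay
    retry
  end
end

# Simulated flaky call: times out twice, then succeeds on the third try.
calls = 0
result = with_retries(attempts: 5, delay: 0) do
  calls += 1
  raise Timeout::Error, 'simulated timeout' if calls < 3
  'response body'
end
puts "#{result.inspect} after #{calls} calls"
```

In real use the block would contain the Net::HTTP request from the question and return res.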

Now, as far as your actual technique for dealing with the flaky API -- the question is, what's the problem with the API that you are dealing with? Badly formatted responses? No responses? Is the problem at the HTTP level or in the data you are getting back?

Maybe you don't yet know, or you don't yet care, but you know that retrying tends to get the job done. In that case, I would at least recommend logging the exceptions. Hoptoad (since renamed Airbrake) has a free plan, and has some sort of thing like Hoptoad.notify(e) -- I can't remember if that's the exact invocation. Or you can email it or log it, using e.message and e.backtrace.
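As a rough sketch of the logging option, using only Ruby's standard Logger: the log_exception helper and its arguments are made up for this illustration, and the failure is simulated.

```ruby
require 'logger'

# Hypothetical helper: record which attempt failed, with the exception's
# class, message and the first few backtrace lines.
def log_exception(logger, error, attempt)
  logger.warn("attempt #{attempt} failed: #{error.class}: #{error.message}")
  # backtrace is nil for an exception object that was never raised
  (error.backtrace || []).first(5).each { |line| logger.warn("  #{line}") }
end

logger = Logger.new($stderr)
begin
  raise IOError, 'simulated failure'
rescue StandardError => e
  log_exception(logger, e, 1)
end
```

This is where the => e in the rescue clause pays off: without it there is no handy reference to the exception to log.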



Answer 2:

When attempting to handle error conditions, be aware that by default Net::HTTP will AUTOMATICALLY RETRY for some HTTP verbs. The source for net/http.rb#transport_request (Ruby v2.5.0) includes:

    if count < max_retries && IDEMPOTENT_METHODS_.include?(req.method)
      count += 1
      @socket.close if @socket
      D "Conn close because of error #{exception}, and retry"
      retry
    end

So unless precautions are taken, Net::HTTP will call the service twice for any HTTP verb included in IDEMPOTENT_METHODS_ before returning an error.

For Ruby < 2.5, the best thing I've come up with in a Rails app is to add something like:

Net::HTTP::IDEMPOTENT_METHODS_.delete_if { true }

This works around the problem by making IDEMPOTENT_METHODS_ empty so the #include? on req.method will always fail.

For Ruby >= 2.5, Net::HTTP has a #max_retries= method that should be set to 0 (unless of course retries are desired).
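A minimal sketch of the Ruby >= 2.5 setting follows. The helper name build_http_without_auto_retry is made up for this illustration, and the host is the placeholder from the question. Net::HTTP.new only builds the object, so nothing is sent until a request is made.

```ruby
require 'net/http'

# Hypothetical helper: build a Net::HTTP client with the built-in automatic
# retry switched off, so the caller's own retry logic is the only one.
def build_http_without_auto_retry(host, port)
  http = Net::HTTP.new(host, port)   # no connection is opened here
  # max_retries= exists on Ruby >= 2.5; on older versions this line is skipped
  http.max_retries = 0 if http.respond_to?(:max_retries=)
  http
end

http = build_http_without_auto_retry('api.flakywebservice.com', 80)
puts "max_retries: #{http.max_retries}" if http.respond_to?(:max_retries)
```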

More background and history may be found here: Ruby Users: Be Wary of Net::HTTP