Exception handling for parallel fetch requests

Posted 2019-04-15 20:44

Question:

I have the following code:

try:
    responses = yield [httpClient.fetch(url) for url in urls]
except (HTTPError, IOError, ValueError) as e:
    print("caught")

I can't guarantee that the given URLs are valid, and I want to use the exception to validate them. How can I tell which URL(s) failed from the caught exception?

Also, if one fetch fails (say the first), it looks like it breaks the rest of the fetches. Is there a way to prevent this? Or is there a better way to check that a URL can be fetched before actually fetching it? Is there a better pattern for this? Basically, I want to fetch all the URLs in parallel and know which ones potentially fail.

Answer 1:

The simplest solution is to pass raise_error=False to fetch(). Then fetch() always returns a response, and you can inspect response.error or call response.rethrow():

responses = yield [httpClient.fetch(url, raise_error=False) for url in urls]
for url, resp in zip(urls, responses):
    try:
        resp.rethrow()  # re-raises the stored error, if any
        print("succeeded")
    except (HTTPError, IOError, ValueError) as e:
        print("caught")


Answer 2:

I thought about doing the following:

@tornado.gen.coroutine
def wrap(httpClient, url):
    try:
        response = yield httpClient.fetch(url)
    except (HTTPError, IOError, ValueError, StopIteration) as e:
        return e
    return response


httpClient = AsyncHTTPClient()
responses = yield [wrap(httpClient, url) for url in urls]

Is there a better or more elegant way? This is also part of a function that is already decorated with @tornado.gen.coroutine; will that pose a problem?
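
As it happens, nesting is not a problem: a function decorated with @tornado.gen.coroutine returns a future when called, so an outer coroutine can yield a list of wrap() calls directly. A hedged sketch of that usage (my own framing; I've dropped StopIteration from the except tuple, since catching it inside a generator is unnecessary and actively problematic on Python 3.7+ under PEP 479):

import tornado.gen
from tornado.httpclient import AsyncHTTPClient, HTTPError

@tornado.gen.coroutine
def wrap(httpClient, url):
    try:
        response = yield httpClient.fetch(url)
    except (HTTPError, IOError, ValueError) as e:
        return e  # Python 3.3+; on Python 2 use: raise tornado.gen.Return(e)
    return response

@tornado.gen.coroutine
def fetch_all(urls):  # itself a coroutine -- nesting wrap() here is fine
    httpClient = AsyncHTTPClient()
    results = yield [wrap(httpClient, url) for url in urls]
    # each entry is either an HTTPResponse or the exception instance,
    # so you can tell per URL what happened
    return {url: r for url, r in zip(urls, results)}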



Answer 3:

The approach in your answer is fine, and at first glance even more elegant than Tornado's own. WaitIterator is the "reference" solution for handling such situations. I've adapted the example from Tornado's docs as follows:

from tornado.ioloop import IOLoop
from tornado.gen import coroutine, WaitIterator
from tornado.httpclient import AsyncHTTPClient, HTTPError

@coroutine
def multi_exc_safe(futures):
    multi = {}
    wait_iterator = WaitIterator(*futures)
    while not wait_iterator.done():
        try:
            res = yield wait_iterator.next()
            # current_index maps the finished future back to its
            # position in the input list
            multi[wait_iterator.current_index] = res
        except (HTTPError, IOError, ValueError, StopIteration) as e:
            multi[wait_iterator.current_index] = e
    return multi

@coroutine
def main():
    urls = [
        'http://google.com',
        'http://nnaadswqeweqw342.comm',
    ]
    httpclient = AsyncHTTPClient()
    responses = yield multi_exc_safe([httpclient.fetch(url) for url in urls])
    print(responses)

IOLoop.instance().run_sync(main)

The cool thing about WaitIterator (probably not relevant to your problem) is that it is an iterator :). It lets you get responses as soon as they arrive, and it works like a charm with async for (sketched below).
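
For illustration, a minimal sketch of the async for form (my own adaptation, not from the original post; it assumes Python 3.5+ and Tornado 4.3+, where WaitIterator supports the async-iterator protocol, and it reuses the wrap-the-exception idea from answer 2 so a failed fetch cannot abort the loop):

import tornado.gen
from tornado.gen import WaitIterator
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop

async def fetch_or_error(client, url):
    # native-coroutine version of wrap() from answer 2: return the
    # exception as a value instead of letting it propagate
    try:
        return await client.fetch(url)
    except Exception as e:
        return e

async def fetch_all(urls):
    client = AsyncHTTPClient()
    # convert_yielded turns each coroutine into a Future that
    # WaitIterator can watch
    futures = [tornado.gen.convert_yielded(fetch_or_error(client, url))
               for url in urls]
    wait_iterator = WaitIterator(*futures)
    results = {}
    async for result in wait_iterator:
        # results arrive in completion order; current_index maps each
        # one back to its position in the input list
        results[urls[wait_iterator.current_index]] = result
    return results

print(IOLoop.current().run_sync(
    lambda: fetch_all(['http://google.com', 'http://nnaadswqeweqw342.comm'])))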



Tags: tornado