DeadlineExceededError: The overall deadline for re

Posted 2019-06-09 12:59

Question:

I have a cron job that calls a vendor API to fetch a list of companies. Once the data is fetched, we store it in Cloud Datastore, as shown in the code below. For the last two days, triggering the cron job has produced the error shown further down. When I debug the code locally, I don't see this error.

    company_list = cron.rest_client.load(config, "companies", '')

    if not company_list:
        logging.info("Company list is empty")
        return "Ok"

    for row in company_list:
        company_repository.save(row,original_data_source, 
                                 actual_data_source)

Repository code

    def save(dto, org_ds, act_dp):
        try:
            key = 'FIN/%s' % (dto['ticker'])
            company = CompanyInfo(id=key)
            company.stock_code = key
            company.ticker = dto['ticker']
            company.name = dto['name']
            company.original_data_source = org_ds
            company.actual_data_provider = act_dp
            company.put()
            return company
        except Exception:
            logging.exception("company_repository: error occurred saving the company record")
            raise

Error

  DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

Exception details

  Traceback (most recent call last):
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 267, in Handle
    result = handler(dict(self._environ), self._StartResponse)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/lib/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/internal/cron/company_list.py", line 21, in run
    company_repository.save(row,original_data_source, actual_data_source)
  File "/base/data/home/apps/p~svasti-173418/internal-api:20170808t160537.403249868819304873/internal/repository/company_repository.py", line 13, in save
    company.put()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3458, in _put
    return self._put_async(**ctx_options).get_result()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 378, in check_success
    self.wait()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 362, in wait
    if not ev.run1():
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/eventloop.py", line 268, in run1
    delay = self.run0()
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/ext/ndb/eventloop.py", line 248, in run0
    _logging_debug('rpc: %s.%s', rpc.service, rpc.method)
  File "/base/data/home/runtimes/python27_experiment/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 453, in service
    @property
DeadlineExceededError: The overall deadline for responding to the HTTP request was exceeded.

Answer 1:

Has your company list been getting bigger?

How many entities are you trying to put?

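Try saving them in batches instead of one at a time in a loop. Each company.put() is a separate blocking Datastore call, so the total time grows with the number of rows. Remove company.put() from save(dto, org_ds, act_dp) so that it only builds and returns the entity, then call ndb.put_multi() on the collected entities instead.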

    company_list = cron.rest_client.load(config, "companies", '')

    if not company_list:
        logging.info("Company list is empty")
        return "Ok"

    company_objs = []
    for row in company_list:
        company_objs.append(company_repository.save(row, original_data_source,
                                                    actual_data_source))
        # put 500 at a time
        if len(company_objs) >= 500:
            ndb.put_multi(company_objs)
            company_objs = []
    # put any remainders
    if len(company_objs) > 0:
        ndb.put_multi(company_objs)
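
The batching loop above could also be factored into a small helper. This is only a sketch: `chunked` and `save_in_batches` are hypothetical names, and `put_batch` is a stand-in for whatever writes one batch (on App Engine that would presumably be `ndb.put_multi`). The default batch size of 500 is taken from the answer above.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    # Yield whatever is left over after the last full batch.
    if batch:
        yield batch


def save_in_batches(entities, put_batch, batch_size=500):
    """Call `put_batch` once per batch and return the number of batches.

    `put_batch` is an assumption of this sketch: any callable that
    accepts a list of entities, e.g. ndb.put_multi on App Engine.
    """
    batches = 0
    for batch in chunked(entities, batch_size):
        put_batch(batch)
        batches += 1
    return batches
```

With this in place, the handler would build all the entities first and then call something like `save_in_batches(company_objs, ndb.put_multi)`, assuming `save()` no longer calls `put()` itself.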


Answer 2:

My answer is based on the one Alex gave, but runs asynchronously.

I've replaced put_multi() with put_multi_async().

By replacing the call to put_multi() with a call to its async equivalent put_multi_async(), the application can do other things right away instead of blocking on put_multi().

I also added the @ndb.toplevel decorator.

This decorator tells the handler not to exit until its asynchronous requests have finished.

If your data grows bigger, you may want to look deeper into the deferred library. It can be used to respawn a task every X batches, passing along the rest of your unprocessed data.

    @ndb.toplevel
    def fetch_companies_list():
        company_list = cron.rest_client.load(config, "companies", '')

        if not company_list:
            logging.info("Company list is empty")
            return "Ok"

        company_objs = []
        for row in company_list:
            company_objs.append(company_repository.save(row, original_data_source,
                                                        actual_data_source))
            # put 500 at a time
            if len(company_objs) >= 500:
                ndb.put_multi_async(company_objs)
                company_objs = []
        # put any remainders
        if len(company_objs) > 0:
            ndb.put_multi_async(company_objs)
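
For the deferred approach mentioned above (respawning a task every X batches), a rough sketch might look like the following. It is not the deferred library's own code: `process_rows` is a hypothetical function, `save_batch` is a hypothetical callable that writes one list of rows, and `defer` is passed in as a parameter so the sketch runs anywhere. On App Engine one would presumably pass `google.appengine.ext.deferred.defer` for it.

```python
def process_rows(rows, start=0, batch_size=500, max_batches=5,
                 save_batch=None, defer=None):
    """Process up to `max_batches` batches, then hand the rest to a new task.

    save_batch: hypothetical callable that writes one list of rows.
    defer: callable that schedules a function call to run later
        (assumed here to behave like deferred.defer on App Engine).
    Returns the index of the first row this call did not process.
    """
    # Work on at most max_batches * batch_size rows in this invocation.
    end = min(len(rows), start + batch_size * max_batches)
    for i in range(start, end, batch_size):
        save_batch(rows[i:min(i + batch_size, end)])

    # If rows remain, schedule a follow-up task that resumes at `end`.
    if end < len(rows):
        defer(process_rows, rows, start=end, batch_size=batch_size,
              max_batches=max_batches, save_batch=save_batch, defer=defer)
    return end
```

Each invocation does a bounded amount of work, so no single request runs long enough to hit the deadline. Passing the whole list to every follow-up task is the simplest option but may not suit very large payloads; passing only an offset and re-fetching the data is one alternative.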