GAE Python - How to set a cron job to launch a bac

Posted 2019-02-15 13:38

I'm running a daily reporting task on GAE that has recently started using too much memory to finish, so I'd like to run it as a backend task. I've defined the backend as follows:

backends:
- name: reporting
  class: B4_1G
  options: dynamic
  start: reporting.app

reporting.py defines a number of classes, each of which runs a different report. My cron.yaml currently looks like this:

cron:
- description: update report 1
  url: /reports/report1
  schedule: every day 03:00
- description: update report 2
  url: /reports/report2
  schedule: every day 03:30

However, this just calls the job on a frontend instance, via app.yaml, which currently looks like this:

application: appname
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /(robots\.txt)
  static_files: \1
  upload: (robots\.txt)
- url: /favicon\.ico
  static_files: favicon.ico
  upload: favicon\.ico
- url: /sitemap\.xml
  static_files: sitemap.xml
  upload: sitemap\.xml
- url: /images
  static_dir: images
- url: /js
  static_dir: js
- url: /css
  static_dir: css
- url: /reports/.*
  script: reporting.app
  login: admin

What would I have to change to call these jobs on a backend instance on a daily basis?

2 Answers
孤傲高冷的网名
Reply #2 · 2019-02-15 14:00

An easier way to do this is to migrate the app to modules, as explained here: https://developers.google.com/appengine/docs/python/modules/

After doing so, you can just add the following line to each job in cron.yaml:

target: yourmodule

The cron job then runs on the instance defined in yourmodule.yaml.
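To make the change concrete, here is a minimal sketch that prints the question's cron.yaml with a target line added to each job. The render_cron helper and the module name "reporting" are made up for illustration; the URLs and schedules are the ones from the question.

```python
# Hypothetical helper: renders cron.yaml entries that each carry a `target` line.
# "reporting" is an assumed module name, not something the question defines.

def render_cron(jobs, target):
    """Return cron.yaml text for (description, url, schedule) tuples."""
    lines = ["cron:"]
    for description, url, schedule in jobs:
        lines += [
            "- description: %s" % description,
            "  url: %s" % url,
            "  schedule: %s" % schedule,
            "  target: %s" % target,
        ]
    return "\n".join(lines) + "\n"


# The two jobs from the question's cron.yaml.
JOBS = [
    ("update report 1", "/reports/report1", "every day 03:00"),
    ("update report 2", "/reports/report2", "every day 03:30"),
]

if __name__ == "__main__":
    print(render_cron(JOBS, "reporting"))
```

Whatever module the target names also has to serve the /reports/ URLs itself, since the cron request is routed there instead of to the default frontend.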

不美不萌又怎样
Reply #3 · 2019-02-15 14:19

It depends on whether you want a persistent or a dynamic backend.

For a dynamic one, the plan is:

  1. A cron job fires at a specific time.

  2. It adds a task to a queue, which starts the backend.

  3. The backend starts.

Example:

app.yaml:

- url: /crons/startgooglepluscrawler/
  script: crons.startgooglepluscrawler.app
  login: admin

backends.yaml:

backends: 
- name: google-plus-crawler
  class: B2
  start: backends.googlepluscrawler.app
  options: dynamic, failfast
  instances: 1

cron.yaml:

cron:
- description: get daily google plus user followers and followings
  url: /crons/startgooglepluscrawler/
  schedule: every day 09:00

queue.yaml:

total_storage_limit: 10M
queue:
- name: google-plus-daily-crawling
  rate: 1/s
  retry_parameters:
    task_retry_limit: 0
    task_age_limit: 1s

In crons/startgooglepluscrawler.py (the startgooglepluscrawler.app referenced in app.yaml), you start the backend by adding a task to the queue:

import logging
import os

import webapp2
from google.appengine.api import taskqueue


class StartGooglePlusCrawlerHandler(webapp2.RequestHandler):

    def get(self):
        logging.info("Running daily Cron")
        # Enqueue a GET to /_ah/start; outside the dev server the task
        # targets the 'google-plus-crawler' backend.
        taskqueue.add(queue_name="google-plus-daily-crawling",
                      url="/_ah/start",
                      method='GET',
                      target=(None if self.is_dev_server() else 'google-plus-crawler'),
                      headers={"X-AppEngine-FailFast": "true"})
        logging.info("Daily Cron finished")

    def is_dev_server(self):
        # True when SERVER_SOFTWARE starts with 'Dev'.
        return os.environ['SERVER_SOFTWARE'].startswith('Dev')


app = webapp2.WSGIApplication([
    ("/crons/startgooglepluscrawler/", StartGooglePlusCrawlerHandler),
], debug=True)
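If you'd rather not have the handler crash when SERVER_SOFTWARE is missing (for example in a unit test), the dev-server check can be pulled out into a plain function. This is only a sketch: pick_target is a made-up helper name, and it assumes the dev server reports a SERVER_SOFTWARE value starting with "Dev", as the handler above does.

```python
import os


def is_dev_server(environ=None):
    """True when SERVER_SOFTWARE starts with 'Dev'; False if it is unset."""
    environ = os.environ if environ is None else environ
    return environ.get("SERVER_SOFTWARE", "").startswith("Dev")


def pick_target(backend_name, environ=None):
    """Hypothetical helper: no target on the dev server, the backend name elsewhere."""
    return None if is_dev_server(environ) else backend_name
```

The handler could then pass target=pick_target('google-plus-crawler') to taskqueue.add.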

backends/googlepluscrawler.py is written like a normal app, with a handler for /_ah/start:

app = webapp2.WSGIApplication(
            [('/_ah/start', StartHandler)],
            debug=True,
            config=config.config)

The above example will fire up the backend instance.
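The snippet above doesn't show StartHandler itself. As a rough idea of what sits behind /_ah/start, here is a sketch written as a bare WSGI callable instead of webapp2, so that it runs without the SDK. run_reports is a placeholder for the real work, not something from the original code.

```python
def run_reports():
    """Placeholder for the long-running job (hypothetical)."""
    return "reports done"


def app(environ, start_response):
    """Bare WSGI app: do the work when /_ah/start is requested, else 404."""
    if environ.get("PATH_INFO") == "/_ah/start":
        status, body = "200 OK", run_reports()
    else:
        status, body = "404 Not Found", "not found"
    data = body.encode("utf-8")
    start_response(status, [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(data)))])
    return [data]
```

With webapp2 the same logic would go in the get method of StartHandler.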
