APScheduler job is not starting as scheduled

Posted 2019-07-25 15:59

Question:

I'm trying to schedule a job to start every minute. I have the scheduler defined in a scheduler.py script:

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor


executors = {
    'default': ThreadPoolExecutor(10),
    'processpool': ProcessPoolExecutor(5)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 5
}

scheduler = BackgroundScheduler(executors=executors,job_defaults=job_defaults)

I initialize the scheduler in the __init__.py of the module like this:

from scheduler import scheduler

scheduler.start()

I want to start a scheduled job on a specific action, like this:

def AddJob():
    dbid = repository.database.GetDbid()
    job_id = 'CollectData_{0}'.format(dbid)
    scheduler.scheduled_job(func=TestScheduler(),
                            trigger='interval',
                            minutes=1,
                            id=job_id
                            )

from time import sleep, time

def TestScheduler():
    for i in range(0, 29):
        starttime = time()
        print("test")
        # Sleep for the rest of the 1-second slot that began at starttime.
        sleep(1.0 - ((time() - starttime) % 1.0))

First, when I call AddJob() in the Python console, the job runs as expected but not in the background: the console is blocked until TestScheduler finishes about 30 seconds later. I expected it to run in the background, since this is a BackgroundScheduler.
Second, the job never runs again, even though I specified a repeat interval of 1 minute.

What am I missing?

UPDATE

I found the issue thanks to another thread. The wrong line is this:

scheduler.scheduled_job(func=TestScheduler(),
                            trigger='interval',
                            minutes=1,
                            id=job_id
                            )

I changed it to:

scheduler.add_job(func=TestScheduler,
                            trigger='interval',
                            minutes=1,
                            id=job_id
                            )

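Two things changed: scheduled_job() became add_job(), and TestScheduler() became TestScheduler. Writing TestScheduler() calls the function immediately (which is why the console blocked) and passes its return value as the func argument, instead of passing the function itself for the scheduler to call later.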
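The difference between passing a function and passing the result of calling it can be seen without APScheduler at all. The following is a minimal sketch: register() and slow_task() are hypothetical names made up for illustration, and register() simply stores whatever it is given.

```python
import time

registered = []  # hypothetical stand-in for a scheduler's job list


def register(func):
    """Hypothetical helper: store something to be called later."""
    registered.append(func)


def slow_task():
    time.sleep(0.2)  # simulate work; this blocks whoever calls it


start = time.monotonic()
register(func=slow_task())  # runs slow_task right now, stores its return value
blocked_for = time.monotonic() - start

register(func=slow_task)    # stores the function itself; nothing runs yet

print(registered[0] is None)    # the first entry is only the return value
print(callable(registered[1]))  # the second entry can be invoked later
```

The first call blocks for the duration of slow_task() and leaves nothing callable behind, which matches both symptoms described in the question.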

Answer 1:

The first problem seems to be that you are starting the scheduler inside __init__.py, which is not the recommended approach.
Code in __init__.py is executed the first time the package, or any module inside it, is imported. For example, imagine this structure:

my_module
|--__init__.py
|--test.py

with __init__.py:

from scheduler import scheduler

scheduler.start()

scheduler.start() is then executed as a side effect of an import such as from my_module import something. So whether and when the scheduler starts depends on the rest of your code: it never starts if nothing imports the package, and it starts again in every process that does.
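To check when the code in __init__.py actually runs, here is a small sketch that builds a throwaway package in a temporary directory and imports it several times. The package name demo_pkg, the module helper and the environment variable DEMO_PKG_RUNS are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (all names here are made up).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    # The package body counts how often it is executed.
    f.write(
        "import os\n"
        "os.environ['DEMO_PKG_RUNS'] = "
        "str(int(os.environ.get('DEMO_PKG_RUNS', '0')) + 1)\n"
    )
with open(os.path.join(pkg_dir, "helper.py"), "w") as f:
    f.write("VALUE = 1\n")

os.environ.pop("DEMO_PKG_RUNS", None)
sys.path.insert(0, root)
importlib.invalidate_caches()

importlib.import_module("demo_pkg")         # first import: __init__.py runs
importlib.import_module("demo_pkg.helper")  # package is already loaded
importlib.import_module("demo_pkg")         # served from sys.modules

runs = int(os.environ["DEMO_PKG_RUNS"])
print(runs)  # 1
```

Within one interpreter the package body runs once, on the first import, so a scheduler.start() placed there is tied to whichever import happens first. A separate process importing the package would execute it again.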

Another problem is the use of the scheduler.scheduled_job() method. If you read the documentation on adding jobs, you will see that the recommended way is to use add_job(); scheduled_job() is a convenience decorator meant to be applied to a function definition.
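To illustrate why calling a decorator-style method directly registers nothing, here is a toy sketch. ToyScheduler is a hypothetical stand-in written for this answer, not APScheduler's implementation; it only assumes that the decorator form wraps an add_job()-style call, which is my understanding of how the library behaves.

```python
class ToyScheduler:
    """Toy stand-in for a scheduler; not APScheduler's real code."""

    def __init__(self):
        self.jobs = {}

    def add_job(self, func, trigger, id, **trigger_args):
        # Registers the job at the point of the call.
        self.jobs[id] = (func, trigger, trigger_args)

    def scheduled_job(self, trigger, id, **trigger_args):
        # Only builds a decorator; nothing is registered until it is applied.
        def decorator(func):
            self.add_job(func, trigger, id=id, **trigger_args)
            return func
        return decorator


scheduler = ToyScheduler()


def collect():
    return "collected"


# Called as a plain method, the decorator is returned and thrown away.
scheduler.scheduled_job("interval", id="never_added", minutes=1)
print(sorted(scheduler.jobs))  # []


# Decorator form: the job is registered when the function is defined.
@scheduler.scheduled_job("interval", id="decorated", minutes=1)
def collect_decorated():
    return "collected"


# add_job form: the job is registered when this line runs.
scheduler.add_job(collect, "interval", id="added", minutes=1)
print(sorted(scheduler.jobs))  # ['added', 'decorated']
```

The decorator form suits jobs that are known when the module is written. A job created in response to an action at runtime, as in AddJob(), fits the add_job() form better.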

I would suggest something like this:

  1. Keep my_scheduler.py as is.
  2. Remove the scheduler.start() line from __init__.py.
  3. Change your main file as follows:

    from my_scheduler import scheduler
    
    if not scheduler.running: # Clause suggested by @CyrilleMODIANO
        scheduler.start()
    
    def AddJob():
        dbid = repository.database.GetDbid()
        job_id = 'CollectData_{0}'.format(dbid)
        scheduler.add_job(
            func=TestScheduler,
            trigger='interval',
            minutes=1,
            id=job_id
        )
    
    ...