I'm planning to deploy an app using Play and have never used its Jobs feature before. My deployment will be large enough to require several load-balanced Play servers, but my computations won't be heavy enough to need Hadoop/Storm/other distributed-processing frameworks.
My question is: how do I handle this scenario in Play? If I set up a job to run every minute, I don't want every single server to do the exact same work at the same time.
I could only find this answer, and I don't like any of the options it suggests.
So, are there any tools or best practices for coordinating jobs, or do I have to build something from scratch?
You can use a table in your database to store a job lock, but you have to check/update this lock in a separate transaction (use JPA.newEntityManager for this).
My JobLock class uses a LockMode enum. Here is the JobLock class, and here is my LockAwareJob that uses this lock.
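The original classes are not shown here, so the following is only a rough, self-contained sketch of what they might look like. Everything beyond the names JobLock, LockMode, and LockAwareJob (the timeout value, the method names, the field names) is an assumption; in a real Play 1.x app, JobLock would be a JPA entity extending play.db.jpa.Model, and acquire() would run its check-and-update in a separate transaction via JPA.newEntityManager so other instances see the lock immediately.

```java
import java.util.Date;

// Hypothetical lock states; the original enum's values are not shown.
enum LockMode { FREE, LOCKED }

// Hypothetical sketch: an in-memory object stands in for the JPA entity
// so the locking logic can run standalone.
class JobLock {
    // Assumed: treat a lock older than 5 minutes as stale (a crashed
    // instance must not hold the lock forever).
    static final long TIMEOUT_MS = 5 * 60 * 1000;
    LockMode mode = LockMode.FREE;
    Date lockedAt;

    // Try to take the lock; returns true on success. In Play this
    // check-and-update would be committed in its own transaction.
    synchronized boolean acquire(Date now) {
        boolean stale = lockedAt != null
                && now.getTime() - lockedAt.getTime() > TIMEOUT_MS;
        if (mode == LockMode.FREE || stale) {
            mode = LockMode.LOCKED;
            lockedAt = now;
            return true;
        }
        return false;
    }

    synchronized void release() {
        mode = LockMode.FREE;
    }
}

// A job base class that only runs its work when the lock is acquired.
// In Play 1.x this would extend play.jobs.Job and override doJob().
abstract class LockAwareJob {
    private final JobLock lock;

    LockAwareJob(JobLock lock) { this.lock = lock; }

    void doJob() {
        if (!lock.acquire(new Date())) {
            // Another instance holds the lock; skip this run.
            return;
        }
        try {
            doLockedJob();
        } finally {
            lock.release();
        }
    }

    // Subclasses put the actual work here.
    abstract void doLockedJob();
}
```

The point of the timeout is that a server which dies while holding the lock does not block the job on every other instance forever.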
In my log4j.properties I add a special line so that an error isn't logged every time an instance fails to acquire the job lock.
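The exact line is not shown; assuming the failed acquisition is logged through a logger named after the job class (the `jobs.LockAwareJob` logger name here is hypothetical), it might look like:

```properties
# Hypothetical: raise the threshold for this logger so a failed lock
# acquisition (an expected, routine event) is not reported as an error
log4j.logger.jobs.LockAwareJob=FATAL
```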
With this solution you can also use the JobLock id to store parameters associated with the job (the last run date, for example).
For simplicity, I would personally use a single instance dedicated to running jobs. Alternatively, you could look at using Akka instead of Jobs if you want finer control over execution and better concurrency and parallel handling.
You can use a database flag to coordinate the jobs, as described in Playframework concurrent jobs management by Pere Villega.
But I think the solution suggested by Guillaume Bort on Google Groups, using memcached, is the best one. There is a memcached module for Play 2: https://github.com/mumoshu/play2-memcached
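The memcached idea works because the cache's "add" operation is atomic and fails if the key already exists, so whichever instance adds the lock key first wins that run, and the key's expiration releases the lock automatically. In Play 1.x this maps to `play.cache.Cache.safeAdd(key, value, expiration)`. Below is a self-contained sketch of the pattern only, with a ConcurrentMap standing in for the shared cache; the class, key, and instance names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the memcached-based lock. A ConcurrentMap
// simulates the shared cache: putIfAbsent has the same "only one
// caller wins" semantics as memcached's atomic add. A real memcached
// entry would also carry an expiration (e.g. "1mn") so the lock
// releases itself after the run.
class CacheLockJob {
    static final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    // Returns true if this instance won the lock for the current run.
    static boolean tryAcquire(String jobKey, String instanceId) {
        return cache.putIfAbsent(jobKey, instanceId) == null;
    }
}
```

Each instance's job calls tryAcquire at the start of every scheduled run and simply returns if it loses; no database table or dedicated job server is needed.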