Greetings, I am developing a web app. One piece of it will allow users to schedule a "reminder" email to be sent to them at a particular time of day. What is the best way to accomplish this? Basically, all the solutions I've come up with operate on a "polling" pattern when what I want is an "interrupt" pattern.
Here are some possible solutions I've come up with:
1. Have a cron job fire every minute. The script it runs checks a database to see if there are any emails to send; if there are, it sends them, otherwise it goes back to sleep. The drawback is that a bit of overhead is incurred every minute. It also may not scale, especially once the number of users gets so large that it takes over a minute to send out all the emails.
2. Same as #1, but the job only fires every 15 minutes. This is a bit more manageable, but it restricts users to reminders on the 15-minute marks, and it still incurs a bit of overhead when there are no emails to send. Not bad, but not perfect either.
3. Have PHP exec() a bit of code that dynamically alters the crontab or schedules an "at" job on the underlying Linux system. This would give me the flexibility and the "interrupt"-type model I so crave, but allowing PHP to exec() system commands would open up a huge security hole. So I'm going to go ahead and rule this one out.
So, anything better than what I've come up with? Perhaps a way to schedule email without using cron? I'm very curious to see what you guys have to say about this :).
Use the first variant. As for "it may take over a minute to send out all the emails", guard the script with a lock file:
- Check whether file_exists('mailing.q'); if it does, the previous run is still sending, so terminate execution.
- Create the file mailing.q.
- Send the emails.
- unlink('mailing.q');
And don't worry about the overhead; it doesn't matter in this case.
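The lock-file steps above could be sketched roughly like this. Note that this is only an illustration: runExclusively() and sendPendingReminders() are made-up names, and instead of a separate file_exists() check it uses fopen() with mode 'x', which creates the file and fails if it already exists, so two runs starting at the same moment can't both pass the check.

```php
<?php
// Sketch of a lock-file guard around a cron-driven mail run.
// runExclusively() is a hypothetical helper, not part of PHP.

function runExclusively(string $lockPath, callable $job): bool
{
    // Mode 'x' creates the file and fails if it already exists,
    // so checking and creating the lock happen in one step.
    $handle = @fopen($lockPath, 'x');
    if ($handle === false) {
        return false; // a previous run still holds the lock; skip this run
    }
    fclose($handle);

    try {
        $job(); // e.g. send the emails
    } finally {
        unlink($lockPath); // release the lock even if the job throws
    }
    return true;
}

// Usage from the cron script (sendPendingReminders is a placeholder):
// runExclusively('mailing.q', 'sendPendingReminders');
```

One thing to keep in mind with any lock file: if the script is killed outright, mailing.q stays behind and blocks later runs until someone removes it.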
You can have a PHP script that remains running. Every set interval, query the database for emails that need to be sent in the next interval. Break that down into an array with one group for every minute. So if you choose 15 minutes, you would have an array with 15 entries, each entry having all emails that need to be sent out at that time.
You can then use forking to split the process, one handles sending the emails, the other sleeps until the next minute and splits again. To scale, you could fork multiple processes with each process handling a certain number of emails.
In a nutshell, one process manages the queue and forks other processes to handle the sending. When the queue is "empty" it gets more. You could have a cron running periodically to make sure the process hasn't died.
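As a rough sketch of the "one group for every minute" step, something like the following could work. The row shape (an 'email' and a 'send_at' Unix timestamp) and the name bucketByMinute() are assumptions for the example; the forking itself (for instance with PHP's pcntl_fork()) and the actual sending are left out.

```php
<?php
// Group reminders into one bucket per minute of the upcoming interval.
// Each reminder is assumed to look like ['email' => ..., 'send_at' => <Unix timestamp>].

function bucketByMinute(array $reminders, int $windowStart, int $minutes): array
{
    // One empty bucket per minute, indexed 0 .. $minutes - 1.
    $buckets = array_fill(0, $minutes, []);

    foreach ($reminders as $reminder) {
        if ($reminder['send_at'] < $windowStart) {
            continue; // before the window; not handled here
        }
        $offset = intdiv($reminder['send_at'] - $windowStart, 60);
        if ($offset < $minutes) {
            $buckets[$offset][] = $reminder['email'];
        }
    }
    return $buckets;
}

// With a 15-minute interval, $buckets[0] holds the emails for the first
// minute, $buckets[14] those for the last one.
```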
There is nothing particularly wrong with using cron as in options #1 and #2. I don't know what kind of app you are building, but giving users the ability to schedule to the exact minute may not be necessary. Even if it is, overlapping runs probably won't be a problem as long as your script marks the status of each reminder as "pending" (or similar) before sending it, and any new instance of the script only picks up reminders that aren't already "pending" or "sent".
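Here is one way the status marking might look, as a sketch only. The table and column names (reminders, status, claimed_by, send_at), the 'new' status, and the function names are all assumptions, and it uses PDO so it isn't tied to a specific database.

```php
<?php
// Assumed schema: reminders(id, email, send_at, status, claimed_by),
// where status is one of 'new', 'pending', 'sent'.

function claimDueReminders(PDO $db, int $now, string $runId): array
{
    // Flip due rows to 'pending' in a single UPDATE, tagging them with this
    // run's id, so an overlapping run cannot pick up the same reminders.
    $claim = $db->prepare(
        "UPDATE reminders SET status = 'pending', claimed_by = :run
         WHERE status = 'new' AND send_at <= :now"
    );
    $claim->execute([':run' => $runId, ':now' => $now]);

    // Fetch only the rows this run claimed.
    $select = $db->prepare(
        "SELECT id, email FROM reminders
         WHERE status = 'pending' AND claimed_by = :run"
    );
    $select->execute([':run' => $runId]);
    return $select->fetchAll(PDO::FETCH_ASSOC);
}

function markSent(PDO $db, int $id): void
{
    // Call this after the mail for the reminder has been handed off.
    $db->prepare("UPDATE reminders SET status = 'sent' WHERE id = :id")
       ->execute([':id' => $id]);
}
```

Each cron run would call claimDueReminders() with its own run id (uniqid() would do), send what it gets back, and call markSent() per reminder.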
You could use Hudson or a similar app, which could help with script management and would let you keep an eye on failures, etc. It can even send out notices when there are failures. It supports its own Java-based cron system.
If the app does get large, you certainly might want to offload this process to a separate server from your web server. You also may want to look into 3rd-party tools for sending mail, if you are not already using an external SMTP service, and see what integration tools they might have. This should also improve delivery rates, etc.
Don't confuse queuing a mail to send and actually sending the mail.
Your mail server might need fifteen minutes to send a single email. But it takes my mail(1) only 0.036s to queue a mail message for sending.
And even if you do wind up with more than 1600 emails to send per minute (at 0.036s each, about 1,660 fit into a minute; good work!), you could tweak your code a little bit to start queuing reminder emails probabilistically several minutes earlier, in anticipation of spikes. Say, look ahead in your database by five minutes to see if there are more than 1000 mails to deliver, and if so start queuing them with 1/5 probability, then 1/4, then 1/3, then 1/2 on each following minute, and finally queue the remainder.
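To make the probabilities concrete, here is a small sketch. It assumes a mail that is m minutes away from its due time gets queued with probability 1/(m+1), which gives 1/5, 1/4, 1/3, 1/2 and then certainty over the last five per-minute runs, so each mail is equally likely to land in any of them. The function name and the injectable $pick parameter are made up for the example; checking for the >1000 spike is left to the caller.

```php
<?php
// Decide whether a mail should be queued on this per-minute run.
// $minutesUntilDue: whole minutes left until the reminder is due.
// $pick: optional stand-in for mt_rand(), only here so the function can be tested.

function shouldQueueEarly(int $minutesUntilDue, int $lookahead = 5, ?callable $pick = null): bool
{
    if ($minutesUntilDue <= 0) {
        return true; // due now (or overdue): always queue the remainder
    }
    if ($minutesUntilDue >= $lookahead) {
        return false; // outside the look-ahead window
    }
    $pick = $pick ?? 'mt_rand';
    // 4 minutes out: 1 in 5; 3 minutes out: 1 in 4; ... 1 minute out: 1 in 2.
    return $pick(1, $minutesUntilDue + 1) === 1;
}
```

Mails that come back false simply stay in the database and get another chance on the next run.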
There is a command, hostman, which allows you to call a function at a specific time. That should do what you want.
This was my original suggestion:
What about a combination?
- Have a cron job run once per <timespan>.
- Have it populate a file with a timestamp => list mapping covering every email due before <current time> + <timespan>.
- Have a second cron job run every minute; if the current time is found in the email list file, run the email sending script.
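The two-job combination above might look something like this in PHP. This is only a sketch: the JSON file format, the row shape ('email' plus a 'send_at' Unix timestamp), and both function names are assumptions made for the example.

```php
<?php
// Job 1 (runs once per timespan): write a minute => emails map to a file.
// Reminders are assumed to look like ['email' => ..., 'send_at' => <Unix timestamp>].
function writeSchedule(string $path, array $reminders, int $now, int $timespan): void
{
    $schedule = [];
    foreach ($reminders as $reminder) {
        $sendAt = $reminder['send_at'];
        if ($sendAt >= $now && $sendAt < $now + $timespan) {
            $minute = $sendAt - ($sendAt % 60); // round down to the minute
            $schedule[$minute][] = $reminder['email'];
        }
    }
    file_put_contents($path, json_encode($schedule), LOCK_EX);
}

// Job 2 (runs every minute): return the emails listed for the current minute.
// An empty result means the sending script does not need to run at all.
function dueEmails(string $path, int $now): array
{
    if (!is_file($path)) {
        return [];
    }
    $schedule = json_decode(file_get_contents($path), true) ?: [];
    $minute = $now - ($now % 60);
    return $schedule[$minute] ?? [];
}
```

The every-minute job then only reads one small file, and touches the database and the mailer only when dueEmails() returns something.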