See also "Having a PHP script loop forever doing computing jobs from a queue system", but that doesn't answer all my questions.
If I want to run a PHP script forever that reads from a queue and processes jobs:

1. What is the potential for memory problems, and how do I avoid them? (Are there any flush functions or something I should use?)
2. What if the script dies for some reason? What would be a good method to automatically start it up again?
3. What would be the best basic approach to starting the script? Since it runs forever, I don't need cron, but how do I start it up? (See also 2.)
Set the queue up as a cron script that executes every 10 seconds. When the script fires up, check whether a lock file (something like .lock) is present. If it is, exit immediately; if not, create the .lock and start processing. If any errors occur, email/log them, delete the .lock, and exit. If there are no tasks, just exit.

I think this approach is ideal, since PHP isn't really designed to keep a single script running for the extended periods you're asking about. To avoid potential memory leaks, crashes, etc., repeatedly re-executing a short-lived script is the safer approach. A sketch of the guard is below.
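One way to sketch that guard in PHP, swapping the bare .lock file check for flock() (the OS releases the lock automatically if the process crashes, so a stale lock can't wedge the queue); fetch_next_task() and process_task() are hypothetical placeholders for your own queue code:

```php
<?php
// Guarded worker meant to run from cron every 10 seconds.
$fp = fopen('/tmp/queue-worker.lock', 'c');
if ($fp === false) {
    exit(1);
}

// flock() is atomic, unlike a file_exists()/touch() pair, and is
// released automatically when the process exits or dies.
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    exit(0); // another instance is already running
}

try {
    // fetch_next_task()/process_task() are placeholders: plug in
    // your own queue access here.
    while (($task = fetch_next_task()) !== null) {
        process_task($task);
    }
} catch (Exception $e) {
    error_log('Queue worker error: ' . $e->getMessage()); // or email it
}

flock($fp, LOCK_UN);
fclose($fp);
```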
While PHP can access message queues (publish and consume), if at all possible use a fully functional MQ application to do this. A fully functional MQ application (in Ruby, Perl, .NET, Java, etc.) will handle all of the concurrency, error logging, state management, and scalability issues that you describe.
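If PHP has to touch the queue at all, one option is to keep it on the publishing side only. A minimal sketch, assuming the phpredis extension and a Redis list used as the queue (the long-running consumer can then be a dedicated worker in one of those languages):

```php
<?php
// Publish a job description; a separate consumer process pops it
// (e.g. with BRPOP) and does the long-running work.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$job = json_encode(array('type' => 'flv2avi', 'input' => '1.flv'));
$redis->lPush('jobs', $job);
```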
Without going too far into state machines, it's at least a good idea to introduce states both for 'jobs' (example: flv2avi conversion) and for 'tasks' (flv2avi 1.flv).

In my script (Perl), zombie processes sometimes start to degrade the whole script's performance. It's a rare case, but it's inherent to the design, so the script should be able to stop reading the queue and let a new instance take over its tasks and jobs, preserving as much of the running tasks' state as possible. Once the first instance is down to one or two remaining tasks, it gets killed.
On start (a sketch follows this list):
- check for common errors (due to shutdown)
- check for known errors (out of space, can't read input)
- kill whatever may be killed and set its status to 'waiting'
- start everything that is 'waiting'
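A PHP sketch of those recovery steps, assuming tasks are tracked in a database table with illustrative status and pid columns (posix_kill() needs the posix extension):

```php
<?php
// Startup recovery: reset tasks that were left 'running' by a crash
// or shutdown. Table and column names are illustrative only.
$db = new PDO('sqlite:/var/lib/worker/tasks.db');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

foreach ($db->query("SELECT id, pid FROM tasks WHERE status = 'running'") as $row) {
    // Signal 0 just tests whether the process still exists.
    if ($row['pid'] && posix_kill((int)$row['pid'], 0)) {
        posix_kill((int)$row['pid'], 15); // 15 = SIGTERM
    }
    $stmt = $db->prepare("UPDATE tasks SET status = 'waiting' WHERE id = ?");
    $stmt->execute(array($row['id']));
}

// The main loop can now start everything that is 'waiting'.
```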
If you run piped jobs (vlc | ffmpeg, tail -f | grep), try to avoid routing the data through your own program's I/O; instead, fork() (a bad idea in PHP?) or just call /bin/bash -c "prog1 | prog2". This saves a lot of CPU load.
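A sketch of handing the whole pipeline to the shell from PHP, so the data never passes through the script; prog1 and prog2 stand in for the real commands (e.g. vlc | ffmpeg):

```php
<?php
// Let the shell wire the pipe; PHP only waits for the exit code.
$pipeline = 'prog1 | prog2';
exec('/bin/bash -c ' . escapeshellarg($pipeline), $output, $exitCode);

if ($exitCode !== 0) {
    error_log("Pipeline exited with code $exitCode");
}
```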
Start points: both /etc/rc.d and cron (check the running processes; start the first instance if none is found, or run a second one with a 'debug' argument).
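A watchdog sketch along those lines, meant to be invoked from cron (and once at boot from /etc/rc.d); the pidfile path and worker script name are assumptions:

```php
<?php
// Restart the worker if its recorded PID is gone (needs the posix
// extension for posix_kill()).
$pidFile = '/var/run/queue-worker.pid';

$pid = @file_get_contents($pidFile);
if ($pid !== false && posix_kill((int)$pid, 0)) {
    exit(0); // worker is alive, nothing to do
}

// Relaunch in the background and record the new PID; `$!` is the
// shell's PID of the last background command.
$newPid = (int)shell_exec('nohup php /usr/local/bin/queue-worker.php > /dev/null 2>&1 & echo $!');
file_put_contents($pidFile, $newPid);
```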