I am writing an application similar to what was suggested here. Essentially, I am using Perl to manage the execution of multiple CPU intensive processes in parallel via fork and wait. However, I am running on a 4-core machine, and I have many more processes, all with very dissimilar expected run-times which aren't known a priori.
Ultimately, it would take more effort to estimate the run times and gang them appropriately, than to simply utilize a queue system for each core. Ultimately I want each core to be processing, with as little downtime as possible, until everything is done. Is there a preferred algorithm or mechanism for doing this? I would assume this is a common problem/use so I don't want to re-invent the wheel, as my wheel will probably be inferior to the 'right way. '
As a minor aside, I would prefer to not have to import additional modules (like Parallel::ForkManager) to accomplish this, but if that is the best way to go, then I will consider it.
~Thanks!
EDIT: Fixed 'here' link: Thanks to ikegami
EDIT: P::FM is too easy to use, not to... Today I Learned.
Forks::Super
has some features that are good for this sort of task.fork
andwait
calls, you can still use the features ofForks::Super
without too many changes. That is, your new code will still havefork
andwait
calls.Parallel::ForkManager
, you can control how many jobs you run simultaneously. When one job completes, the module can start another one, keeping your system fully utilized. You can also specify more complex logic like "run at most 6 background jobs on the weekends or between midnight and 6:00 am, but 2 background jobs the rest of the time"timing utilities:
Forks::Super
keeps track of the start time and end time of every job, letting you log and analyze how long each job took:CPU affinity control: I can't tell whether this is something you need, but Guarav seemed to think it mattered. You can assign background jobs to specific cores