I am writing a PHP cron job that reads thousands of feeds / web pages using curl and stores the content in a database. How do I restrict the number of threads to, let's say, 6? That is, even though I need to scan thousands of feeds / web pages, I want only 6 curl threads active at any time so that my server and network don't get bogged down. I could do it easily in Java using the wait, notify, and notifyAll methods of Object. Should I build my own semaphore, or does PHP provide any built-in functions?
First of all, PHP doesn't have threads, but it does have process control: http://php.net/manual/en/book.pcntl.php
I've built a class around these functions to help with my multi-process requirements.
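That class isn't shown here, so the following is only a minimal sketch of how the pcntl functions could cap the number of concurrent workers at 6. The name `run_limited()` and its parameters are my own assumptions, not part of the class mentioned above, and it needs the pcntl extension (CLI on a Unix-like system).

```php
<?php
// Sketch only: run_limited() is a hypothetical helper, not a PHP built-in.
// It forks one child per job but never keeps more than $maxChildren alive.

function run_limited(array $jobs, int $maxChildren, callable $worker): int
{
    $running = 0; // children currently alive
    $failed  = 0; // children that did not exit with status 0

    // Block until any one child exits, then free its slot.
    $reap = function () use (&$running, &$failed): void {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            throw new RuntimeException('pcntl_wait failed');
        }
        $running--;
        if (!pcntl_wifexited($status) || pcntl_wexitstatus($status) !== 0) {
            $failed++;
        }
    };

    foreach ($jobs as $job) {
        // At the limit: wait for a free slot before forking again.
        if ($running >= $maxChildren) {
            $reap();
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child: handle exactly one job, then exit so it never
            // continues the parent's loop.
            exit($worker($job) ? 0 : 1);
        }
        $running++; // parent
    }

    // Wait for the children that are still running.
    while ($running > 0) {
        $reap();
    }

    return $failed;
}
```

With a worker that fetches one URL with curl and returns true on success, `run_limited($urls, 6, $worker)` would keep at most 6 fetches in flight. Open the database connection inside the worker rather than before the fork, since a connection shared across forked processes tends to misbehave.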
I'm in a similar situation. I'm keeping a log of the processes that get started from cron and their state. I'm checking on them from a related cron job.
EDIT (more details):
In my project I log all the key changes to the database. Actions may then be taken if the changes meet the action's criteria. So what I'm doing is different from what you're doing. However, there are some similarities.
When I fork a new process, I enter its PID in a DB table. The next time the cron job kicks in, part of what it does is check whether those processes have completed properly, and then mark the action as completed in that DB table.
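The "check whether they have completed" step could look roughly like the sketch below. It leaves the DB table out and only shows the liveness check; `is_process_alive()` and `partition_pids()` are hypothetical names, the posix extension is required, and I'm assuming the tracked processes run as the same user as the cron job.

```php
<?php
// Sketch only: hypothetical helpers for checking PIDs recorded earlier.

function is_process_alive(int $pid): bool
{
    // Signal 0 delivers nothing; it only tests whether the PID can be signalled.
    return $pid > 0 && posix_kill($pid, 0);
}

// Split a list of recorded PIDs into those still running and those that are gone.
function partition_pids(array $pids): array
{
    $running  = [];
    $finished = [];
    foreach ($pids as $pid) {
        if (is_process_alive($pid)) {
            $running[] = $pid;
        } else {
            $finished[] = $pid;
        }
    }
    return ['running' => $running, 'finished' => $finished];
}
```

A PID that is gone only tells you the process ended, not that it ended successfully, and PIDs can be reused by the OS, so it is worth having the child write its own result to the table as well.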
You don't give many details about your project. So I will just throw out a suggestion:
Depending on the size of your project, this may seem like overkill. However, I've thought about it for a long, long time, and I want to keep track of all those forked processes. Forking can be risky business and can lead to system resource overload - speaking from experience ;)
I'd be interested to hear other techniques as well.
From my reply to "PHP using proc_open so that it doesn't wait for the script it opens (runs) to finish?":
Some of my code from when I played around with proc_open:
I had issues with proc_close (it took 10 to 30 seconds), so I just killed the process using the Linux kill command.
Curl sometimes freezes for me on various servers (Ubuntu, CentOS), but not on all of them, so I kill any "child" process that takes over 40 seconds. Normally the script takes 10 seconds at most, and I'd rather redo the work than wait a minute or so for curl to un-freeze.
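A minimal sketch of that "start a child, kill it if it runs too long" idea is below. It is not my original code: `run_with_timeout()` is a hypothetical name, it uses proc_terminate with SIGKILL (signal 9) instead of shelling out to kill, and it assumes PHP 7.4+ so the command can be passed as an array and no intermediate shell is involved.

```php
<?php
// Sketch only: run one child command and kill it after $limitSeconds.

function run_with_timeout(array $cmd, float $limitSeconds): array
{
    // Discard the child's output; a real worker would log somewhere instead.
    $spec = [
        1 => ['file', '/dev/null', 'w'],
        2 => ['file', '/dev/null', 'w'],
    ];
    $proc = proc_open($cmd, $spec, $pipes);
    if (!is_resource($proc)) {
        throw new RuntimeException('could not start child');
    }

    $start    = microtime(true);
    $timedOut = false;
    $exitCode = -1;

    while (true) {
        $status = proc_get_status($proc);
        if (!$status['running']) {
            // Read the exit code on the first status call after the child exits.
            $exitCode = $status['exitcode'];
            break;
        }
        if (microtime(true) - $start > $limitSeconds) {
            proc_terminate($proc, 9); // SIGKILL: the child cannot ignore it
            $timedOut = true;
            break;
        }
        usleep(100000); // poll every 0.1 s
    }

    proc_close($proc); // the child has exited or been killed by now

    return ['timed_out' => $timedOut, 'exit_code' => $exitCode];
}
```

For example, `run_with_timeout([PHP_BINARY, 'check_working_child.php', '5', '60'], 40.0)` would give child number 5 at most 40 seconds before it is killed. To run several children at once, keep an array of such handles and poll them all in the same loop.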
Then create a file named 'check_working_child.php' to do all the work. The first parameter is the instance number and the second is the time limit. For example,
php check_working_child.php 5 60
means you are the 5th child and are allowed to run for 60 seconds.
If the above code does not run, let me know and I will post it using pastebin or something...
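For the child side, reading those two parameters could be sketched like this. The helper names `parse_child_args()` and `deadline_reached()` are assumptions for illustration, not the contents of the original check_working_child.php.

```php
<?php
// Sketch only: hypothetical helpers for a script like check_working_child.php.

// Validate "php check_working_child.php <instance> <seconds>" style arguments.
function parse_child_args(array $argv): array
{
    $isNumber = function ($s): bool {
        return is_string($s) && preg_match('/^\d+$/', $s) === 1;
    };
    if (count($argv) < 3 || !$isNumber($argv[1]) || !$isNumber($argv[2])) {
        throw new InvalidArgumentException(
            'usage: php check_working_child.php <instance> <seconds>'
        );
    }
    return ['instance' => (int) $argv[1], 'limit' => (int) $argv[2]];
}

// True once the child has used up its allowed wall-clock time.
function deadline_reached(float $start, int $limit, ?float $now = null): bool
{
    $now = $now ?? microtime(true);
    return ($now - $start) >= $limit;
}

// Possible use at the top of the child script:
//   $args  = parse_child_args($argv);
//   $start = microtime(true);
//   ... do one unit of work, then stop when deadline_reached($start, $args['limit'])
```

The child can only check the deadline between units of work, so a frozen curl call still needs the parent to kill it from outside.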