Scalable, Delayed PHP Processing

Posted 2019-01-16 15:03

I'm working on an online PHP application that needs delayed PHP events. Basically, I need to be able to execute arbitrary PHP code some number of seconds (though it could be days) after the initial hit to a URL. I need fairly precise execution of these PHP events, and I also want it to be fairly scalable. I'm trying to avoid having to schedule a cron job to run every second. I was looking into Gearman, but it doesn't seem to provide any way to schedule events and, as I understand it, PHP isn't really meant to run as a daemon.

It would be ideal if I could tell some external process to poll an "event checker" URL on the PHP server at the exact time that the next event should run. This poll time would need to be adjustable at will, since events can be added to and removed from the queue. Any ideas on an elegant way to accomplish this? There is simply too much overhead in calling PHP externally (having to parse an HTTP request or calling via the CLI) to make this idea feasible for my needs.

My current plan is to write a PHP daemon that will run the events, and to interface with it from the PHP server via Gearman. The PHP daemon would be built around SplMinHeap, so hopefully the performance wouldn't be too bad. This idea leaves a bad taste in my mouth, and I was wondering if anyone had a better one. My ideas have changed slightly; read Edit 2.
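For readers wondering what a queue built around SplMinHeap might look like, here is a minimal sketch. It is not the asker's code: the class name DelayedQueue and its methods are hypothetical, and it assumes PHP 7.4 or later. Because SplMinHeap cannot remove an arbitrary element, cancellation is handled lazily: a cancelled event is simply forgotten and its stale heap entry is skipped when it reaches the top.

```php
<?php
// Hypothetical sketch: a delayed-event queue on top of SplMinHeap with lazy cancellation.
final class DelayedQueue
{
    private SplMinHeap $heap;
    /** @var array<int, callable> callbacks of events that are still live, keyed by id */
    private array $live = [];
    private int $nextId = 1;

    public function __construct()
    {
        $this->heap = new SplMinHeap();
    }

    /** Schedule $callback for Unix time $runAt; the returned id can be passed to cancel(). */
    public function schedule(float $runAt, callable $callback): int
    {
        $id = $this->nextId++;
        $this->live[$id] = $callback;
        // Arrays compare element by element, so the heap orders by time, then by id.
        $this->heap->insert([$runAt, $id]);
        return $id;
    }

    /** Cancelling only forgets the callback; the heap entry is discarded later. */
    public function cancel(int $id): void
    {
        unset($this->live[$id]);
    }

    /** Seconds until the next live event, or null if nothing is scheduled. */
    public function secondsUntilNext(float $now): ?float
    {
        $this->dropCancelled();
        return $this->heap->isEmpty() ? null : max(0.0, $this->heap->top()[0] - $now);
    }

    /** Run every live event whose time has come; returns how many ran. */
    public function runDue(float $now): int
    {
        $ran = 0;
        while (true) {
            $this->dropCancelled();
            if ($this->heap->isEmpty() || $this->heap->top()[0] > $now) {
                return $ran;
            }
            [, $id] = $this->heap->extract();
            $callback = $this->live[$id];
            unset($this->live[$id]);
            $callback();
            $ran++;
        }
    }

    /** Pop entries off the top of the heap until a live one (or nothing) remains. */
    private function dropCancelled(): void
    {
        while (!$this->heap->isEmpty() && !isset($this->live[$this->heap->top()[1]])) {
            $this->heap->extract();
        }
    }
}
```

With many cancelled calls the heap temporarily holds dead entries, which trades some memory for O(1) cancellation; whether that trade is acceptable depends on how far ahead events are scheduled.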

EDIT:

I'm creating an online game that involves players taking turns with a variable time limit. I'm using XMPP and BOSH to push messages to and from my clients, and I've got that part all done and working. Now I'm trying to add an arbitrary event that triggers after a play from the client, to let the client (and the other people in the game) know that he took too long. I can't use a timed trigger on the client side because that would be exploitable (since the client can play by themselves). Hope that helps.

EDIT 2:

Thank you all for your feedback. While I think most of your ideas would work well on a small scale, I have a feeling they wouldn't scale very well (external event manager) or would lack the exactness this project requires (cron). Also, in both cases they are external pieces that could fail, and they add complexity to an already complex system.

I personally feel that the only clean solution that meets the requirements for this project is to write a PHP daemon that handles the delayed events. I've begun writing what I think is the first PHP run loop. It handles watching the sockets and executing delayed PHP events. Hopefully, when I'm closer to being done with this project, I can post the source if any of you are interested in it. So far in testing it has proven to be a promising solution (no problems with memory leaks or instability).
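One way such a run loop could combine socket watching with delayed events is to use the time remaining until the next event as the stream_select() timeout. The following is only a sketch under that assumption, not the asker's implementation: the function name loopOnce and the [dueAt, callback] timer format are hypothetical, and it assumes PHP 7.4 or later.

```php
<?php
/**
 * One iteration of a hypothetical run loop: wait for socket activity, but no
 * longer than the time left until the earliest timer, then fire due timers.
 *
 * @param resource[] $sockets    streams to watch for readability
 * @param array      $timers     list of [dueAt (Unix time as float), callable]
 * @param callable   $onReadable called once per readable stream
 * @return array the timers that are not yet due
 */
function loopOnce(array $sockets, array $timers, callable $onReadable): array
{
    // Earliest timer first.
    usort($timers, fn(array $a, array $b) => $a[0] <=> $b[0]);

    $read = $sockets;
    $write = null;
    $except = null;
    $ready = 0;

    if ($timers) {
        // Round up so the wait never ends just before the timer is due.
        $usecTotal = (int) ceil(max(0.0, $timers[0][0] - microtime(true)) * 1e6);
        if ($read) {
            $ready = stream_select($read, $write, $except, intdiv($usecTotal, 1000000), $usecTotal % 1000000);
        } else {
            usleep($usecTotal); // nothing to watch, so a plain sleep is enough
        }
    } elseif ($read) {
        $ready = stream_select($read, $write, $except, null); // no timers: block until I/O
    }

    if ($ready) {
        foreach ($read as $stream) {
            $onReadable($stream);
        }
    }

    // Fire every timer that is due now and keep the rest for the next iteration.
    $now = microtime(true);
    $remaining = [];
    foreach ($timers as [$dueAt, $callback]) {
        if ($dueAt <= $now) {
            $callback();
        } else {
            $remaining[] = [$dueAt, $callback];
        }
    }
    return $remaining;
}
```

A daemon would call this in a loop, feeding the returned timers back in along with any newly scheduled ones. Re-sorting on every iteration keeps the sketch short; a heap would be the natural replacement at the event rates mentioned in the requirements.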

EDIT 3: Here is a link to the PHP event loop library called LooPHP for those who are interested.

TL;DR Requirements

  • Call PHP (preferably natively) at a delayed time (ranging from seconds to days)
  • Handle creation/updating/deletion of events arbitrarily (I'm expecting a large number of cancelled calls)
  • Handle a high load of scheduled events (100-1000 a second per server)
  • Calls should run within one second of their scheduled time
  • At this point I'm not open to rewriting the code base in another language (maybe some day I will)

15 answers
Explosion°爆炸
#2 · 2019-01-16 15:22

Here's the correct answer, but you may not like it.

PHP is designed entirely around being used as a request-response (HTTP) language, and thus doesn't support what you are looking for. It's great to hack and find ways around that, but it will be just that, a hack, whatever 'solution' you end up with.

What you really need is an event-driven language that supports XMPP, and for that you need look no further than node.js/V8 and the supporting XMPP libraries, which natively support, and are designed for, just what you need. You could also go down the Java route, but if you want to port quickly and get a whole host of new features and support for what you are doing, node is the one.

If you insist on going with PHP (as I have many times over many years), the 'lightest' and most effective way to do this is a persistent PHP daemon with an event queue in a database - sadly!

smile是对你的礼貌
#3 · 2019-01-16 15:23

I can't think of anything that does everything you asked for:

  • has to be very precise
  • delay for long periods of time
  • ability to remove/change the time of the event

The trivial way would be to use a combination of the following functions:

set_time_limit(0);
ignore_user_abort(true);
time_sleep_until(strtotime('next Friday'));
// execute code

However, as @deceze said, it's probably not a very good idea: if you set a long delay, Apache could eventually kill the child process (unless you're using the PHP CLI, which would make it easier). It also doesn't allow you to change or delete the event unless you add more complex logic and a database to hold the events. Also, register_shutdown_function() might be useful if you want to go down this road.

In my opinion, a better approach would be to set up a cron job.

手持菜刀,她持情操
#4 · 2019-01-16 15:25

Check out this scheduler, which uses Redis. It may be useful for your problem:

https://github.com/chrisboulton/php-resque-scheduler

趁早两清
#5 · 2019-01-16 15:33

I think a PHP-only solution will be hard (almost impossible) to implement. I came up with two solutions to your problem.

PHP/Redis solution

Questions asked by Kendall:

  • How stable is redis:

Redis is very stable. The developer writes really clean C code; you should check it out on GitHub ;). A lot of big sites use Redis, for example GitHub, which had a really interesting blog post about how they made GitHub fast :). Superfeedr also uses Redis, and there are many more big companies that do ;). I would advise you to google for it ;).

  • How PHP-friendly is redis:

Redis is very PHP friendly. A lot of users are writing PHP libraries for Redis. The protocol is really simple; you can debug it with telnet ;). From a quick look, predis, for example, has the blocking pop implemented.

  • How would I remove events:

I think you should use something like ZRemCommand (the Redis ZREM command removes a member from a sorted set).

Redis is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All these data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server-side union, intersection, and difference between sets, and so forth. Redis supports different kinds of sorting abilities.

What I came up with (pseudo-code):

processor.php:

<?php
###### processor.php
###### Start enough workers to keep up with the events, e.g. run `nohup php processor.php &` several times.
# $key:  list of events that are due now; should be unique, and is also used by wakeup.php.
# $key1: sorted set of scheduled events, shared with client.php and wakeup.php.
while (true) {
    $event = blpop($key);  # Blocks until an event is due; one of the idle workers wakes up and takes it.
    process($event);       # You should write process(). It may take a while, and this worker is busy until it returns.
    zrem($key1, $event);   # Remove the event from the schedule after processing it.
}

client.php:

###### client.php
###### The user/browser should generate these events.
# $key1:   sorted set of scheduled events; should be unique.
# $key2:   list used to tell wakeup.php that the earliest event changed.
# $millis: when the event should run (used as the sorted-set score).
# $event:  the event to work on.

# Earliest scheduled event before this request changes anything.
$oldfirst = zrange($key1, 0, 0);

if ("add event") {
    zadd($key1, $millis, $event);
} else if ("delete event") {
    zrem($key1, $event);
}

# Earliest scheduled event after the change.
$first = zrange($key1, 0, 0);

if ($oldfirst != $first) { # A different event is now first => notify wakeup.php.
    lpush($key2, $first);
}

wakeup.php:

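###### wakeup.php
###### Run exactly one copy, e.g. `nohup php wakeup.php &`.
# http://code.google.com/p/redis/wiki/IntroductionToRedisDataTypes => read the sorted set part.
while (true) {
    $first = zrange($key1, 0, 0);                   # Event that is due next.
    $event = blpop($key2, $timeoutTillFirstEvent);  # Sleep until it is due, or until client.php reports a new first event.

    if ($event == null) {
        # The blocking pop timed out, so $first is due: hand it to one of the processor.php workers.
        lpush($key, $first);
    }
    # Otherwise the schedule changed; loop around and recompute the timeout.
}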

Along these lines you could write a pretty efficient scheduler using only PHP and Redis (Redis is written in C, so it is very fast :)). I would also like to code this solution, so stay tuned ;). I think I could write a usable prototype in a day.
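The sorted-set logic in the pseudo-code above can be tried out without a Redis server. The sketch below is an assumption-laden illustration, not the answerer's prototype: FakeSortedSet is a hypothetical in-memory stand-in whose method names only mirror the Redis commands ZADD, ZREM and ZRANGEBYSCORE, and dispatchDue() is a hypothetical helper that returns the timeout wakeup.php would pass to its blocking pop. It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical in-memory stand-in for one Redis sorted set (member => score).
// It talks to no server; it only imitates the semantics the scheduler relies on.
final class FakeSortedSet
{
    /** @var array<string, float> members ordered by score */
    private array $scores = [];

    /** Like ZADD: adding an existing member again just updates its score. */
    public function zadd(string $member, float $score): void
    {
        $this->scores[$member] = $score;
        asort($this->scores); // O(n log n) per call, acceptable for a sketch
    }

    /** Like ZREM: removing a missing member is a no-op. */
    public function zrem(string $member): void
    {
        unset($this->scores[$member]);
    }

    /** Like ZRANGEBYSCORE -inf $max: members with score <= $max, lowest first. */
    public function zrangebyscore(float $max): array
    {
        $due = array_filter($this->scores, fn(float $s) => $s <= $max);
        return array_map('strval', array_keys($due));
    }

    /** Score of the earliest member, or null when the set is empty. */
    public function firstScore(): ?float
    {
        return $this->scores ? reset($this->scores) : null;
    }
}

/**
 * Process everything that is due at $now and return the number of seconds
 * until the next event (the blocking-pop timeout), or null if none is left.
 */
function dispatchDue(FakeSortedSet $schedule, float $now, callable $process): ?float
{
    foreach ($schedule->zrangebyscore($now) as $event) {
        $schedule->zrem($event); // remove before running so it cannot fire twice
        $process($event);
    }
    $next = $schedule->firstScore();
    return $next === null ? null : $next - $now;
}
```

Rescheduling is just another zadd() with a new score and cancellation is a zrem(), which is why a sorted set suits a workload with many cancelled events.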

My Java solution

This morning I created a Java program that I think you can use for your problem.

  1. download:

    Visit github's download page to download the jar file(with all dependencies included).

  2. install:

    java -jar schedule-broadcaster-1.0-SNAPSHOT-jar-with-dependencies-1277709762.jar

  3. Run simple PHP snippets

    1. First php -f scheduler.php
    2. Next php -f receiver.php
  4. Questions

    I created these little snippets so that hopefully you will understand how to use my program. There is also a little documentation in the wiki.

App Engine's TaskQueue

A quick solution would be to use Google App Engine's task queue, which has a reasonable free quota. After that you have to pay for what you use.

Using this model, App Engine's Task Queue API allows you to specify tasks as HTTP Requests (both the contents of the request as its data, and the target URL of the request as its code reference). Programmatically referring to a bundled HTTP request in this fashion is sometimes called a "web hook."

Importantly, the offline nature of the Task Queue API allows you to specify web hooks ahead of time, without waiting for their actual execution. Thus, an application might create many web hooks at once and then hand them off to App Engine; the system will then process them asynchronously in the background (by 'invoking' the HTTP request). This web hook model enables efficient parallel processing - App Engine may invoke multiple tasks, or web hooks, simultaneously.

To summarize, the Task Queue API allows a developer to execute work in the background, asynchronously, by chunking that work into offline web hooks. The system will invoke those web hooks on the application's behalf, scheduling for optimal performance by possibly executing multiple webhooks in parallel. This model of granular units of work, based on the HTTP standard, allows App Engine to efficiently perform background processing in a way that works with any programming language or web application framework.

不美不萌又怎样
#6 · 2019-01-16 15:33

I would just use cron to run a PHP file every so often (e.g. every 5 minutes). The PHP file would check whether any events need to be fired within the next interval, grab the list of events for that interval, and sleep until the first one. Wake up, fire the next event(s) in the list, sleep until the one after that, and repeat until done.

You could even scale it by forking or launching another PHP file to actually fire the event. Then you could fire more than one event at the same time.
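The cron-plus-sleep approach described above could be sketched roughly as follows. The function name runWindow and the [dueAt, callback] event format are hypothetical, the events would in practice be loaded from a database rather than passed in, and the sketch assumes PHP 7.4 or later.

```php
<?php
/**
 * Hypothetical worker body for a script that cron starts every $window seconds.
 * It runs, in order, every event that falls due before the next cron start,
 * sleeping between events with time_sleep_until().
 *
 * @param array $events list of [dueAt (Unix time as float), callable]
 * @param float $start  time this cron run started
 * @param float $window seconds between cron runs (300 for every 5 minutes)
 * @return int number of events fired
 */
function runWindow(array $events, float $start, float $window): int
{
    // Keep events due before the next run; overdue ones are included and fire at once.
    $batch = array_filter($events, fn(array $e) => $e[0] < $start + $window);
    usort($batch, fn(array $a, array $b) => $a[0] <=> $b[0]);

    foreach ($batch as [$dueAt, $callback]) {
        // time_sleep_until() warns when the timestamp is already in the past.
        if ($dueAt > microtime(true)) {
            time_sleep_until($dueAt);
        }
        $callback();
    }
    return count($batch);
}
```

Events added or cancelled after the batch has been loaded are not seen until the next cron run, so this sketch does not cover the question's requirement to change events at will.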

劳资没心,怎么记你
#7 · 2019-01-16 15:34

What about using cron to run a checker that can then execute stuff from the DB, for example?

Or using the Linux "at" command to schedule execution of some command?
