Is this the right way to use a messaging queue?

2019-01-23 14:51发布

问题:

I am new to messaging queues, and right now I am using ZeroMQ on my Linux server. I am using PHP to write both the client and the server. This is mainly used for processing push notifications.

I am using the basic REQ-REP Formal-Communication Pattern on single I/O-threaded ZMQContext instances, as they have demonstrated.

Here is the minimised zeromqServer.php code:

include("someFile.php");

$context = new ZMQContext(1);

//  Socket to talk to clients
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("tcp://*:5555");

while (true) {
    $request = $responder->recv();
    printf ("Received request: [%s]\n", $request);

    //  -----------------------------------------------------------------
    //                                    Process push notifications here
    //
    sleep (1); 

    //  -----------------------------------------------------------------
    //                                          Send reply back to client
    $responder->send("Basic Reply");
}

And here is a minimised ZeroMQ client:

$context = new ZMQContext();

//  Socket to talk to server
echo "Connecting to hello world server…\n";
$requester = new ZMQSocket($context, ZMQ::SOCKET_REQ);
$check = $requester->connect("tcp://localhost:5555");

var_dump($check);
$requester->send("json string payload with data required to process push notifications.");

//$reply = $requester->recv();

So, what I do? I run the zeromqServer.php as a background service, using the linux command

nohup php zeromqServer.php &

This runs it as a background process. Now, when the client calls it, it does the required job.

But the problem is that, I need to restart the process every time there is a change in any of the files ( included the ones include-ed in the zeromqServer file ).

And moreover, somehow after 2-3 days, it just stops working. The process does not stop, but it just stops working.

I feel like it must be some socket issue, maybe the socket is not open anymore. At that time I have to restart the zeromqServer.php file process.

Q1: What might be the issue?

Q2: And what is the right way to do this?

回答1:

This is not the right way to use a Messaging Queue.

A1: The issue is your server finally has to block, as client code does not retrieve any answer while the REQ/REP-pattern requires to do so. The zeromqServer.php on the REP side will simply not attempt to recv() another message from an associated client ( on the REQ side of the Formal Communication Patter ) until the client has been physically delivered ( into an internal buffer ) and has recv()-ed the "reply"-message from the zeromqServer.php side.

Earlier versions of ZeroMQ, ver. 2.1 et al, used to have an unlimited, infinite, default limit sizing of a node's internal message queues and memory-management, used for a low-level I/O-thread buffering, before the data was copied into O/S-kernel resources and released from ZeroMQ memory footprint.

The newer versions of ZeroMQ, ver 3.x+, have the so called HWM-s ( a.k.a. High-Water-Mark-s ) by default "just" 1000 messages oustanding "short", after which the respective piece of such ZeroMQ resource starts to block or drop messages.

While a reactive attempt to explicitly increase HWM-settings management looks like a dirty way to solve a principal design error, another ZeroMQ warning is fair on this for further caution to go just in this direction ( ØMQ does not guarantee that the socket will accept as many as ZMQ_SNDHWM messages, and the actual limit may be as much as 60-70% lower depending on the flow of messages on the socket ).

Forgetting or being unable to do this ( Ref.: as the OP code has already demonstrated ):

//$reply = $requester->recv();

means your REQ/REP Formal Communication Pattern "pendulum" becomes irreversibly deadlocked ( forever ).


A2: Basic REQ/REP Formal-Communication-Pattern sounds straight, but has a few dangerous features, the observed blocking is being just one of this. Some additional steps might be taken code-wise, to deploy XREQ/XREP, DEALER/ROUTER and other tools, but the design should be revised ground up, not just SLOC by SLOC, as there seems to be a lot of things to realise, before designing a code. One of the major mistake is to assume a message is sent once a send() method has been ordered. Not the case in ZeroMQ.

Also the code-design should assume an uncertain nature of a message deliver and handle properly both the missing message problem and any kind of a distributed-service blocking incident ( deadlock, livelock, buffer-threshold overflow, old/new-API conflicts as there is no explicit warranty any of your messaging peers ( there is no central message-broker in ZeroMQ ) has the very same version of ZeroMQ API / protocol implemented on it's localhost side -- so there is indeed a lot of new points of view, during the code design )


The best way to do this

If you can trust and believe in a piece of a hands-on experience, your best next step ought be to download and read the fabulous Pieter HINTJENS' book "Code Connected, Volume 1", where Pieter has a lot of insights on distributed processing, including many hints and directions for reliable-patterns, as you will like to implement.

Do read the book, it is both worth your time and you will likely revisit the book many times, if you stay in distributed software design, so do not hesitate to start right now jumping to such a 400+ pages cookbook from the Master of Masters, the Pieter HINTJENS out of questions is.

Why? The real world is typically much more complex

Just one picture, Fig.60 from the above-mentioned book to forget about individual archetype's re-use and to realise the need for a proper end-to-end distributed system design perspective, including blocking-avoidance and deadlock-resolution strategies:

Just to have some idea, look into the following code-example, from a simple distributed messaging, where aMiniRESPONDER process uses multiple ZeroMQ channels.


How to improve your implementation in a fairly large Web PHP-application domain?

Learn how to both prevent ( design-wise ) and handle ( deux-ex-machina type ) other collisions.

PHP has on it's own all proper syntax-constructors for this type of algorithmisation, but the architecture and design is in your hands, from start to end.

Just to realise, how bigger the collisions-aware { try:, except:, finally: } style of a ZeroMQ signalling-infrastructure setup / system-part / ZeroMQ graceful-termination efforts are, check the [SoW] just by row numbers:

14544 - 14800 // a safe infrastructure setup on aMiniRESPONDER side   ~ 256 SLOCs
15294 - 15405 // a safe infrastructure graceful termination          ~ 110 SLOCs

compared to the core-logic of the event-processing segment of aMiniRESPONDER example

14802 - 15293 // aMiniRESPONDER logic, incl. EXC-HANDLERs             ~ 491 SLOCs

A final note on ZeroMQ-based distributed systems

Demanding? Yes, but very powerful, scaleable, fast and indeed rewarding on proper use. Do not hesitate to invest your time and efforts to acquire and manage your knowledge in this domain. All your further software projects may just benefit from this professional investment.



回答2:

I can only partially answer this question. I have no idea why the process would hang after 2-3 days.

Overall a PHP script is loaded once at script execution time, there does not seem to be any way around this limitation. However your current code can be rewritten as follows:

someFile.php

$context = new ZMQContext(1);

//  Socket to talk to clients
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("tcp://*:5555");

while (true) {
    $request = $responder->recv();
    printf ("Received request: [%s]\n", $request);

    $subscriptParams = [
        "Request" => $request //Add more parameters here as needed
    ];

    $result = shell_exec("php work.php ".base64_encode(serialize($subscriptParams)));       

    //  Send reply back to client
    $responder->send("Basic Reply");
}

work.php

if (!isset($argv[0])) {
   die("Need an argument");
}

$params = unserialize(base64_decode($argv[0]));
//Validate parameters
$request = $params["Request"];
//  Do some 'work'
//Process push notifications here!
sleep (1); 

Here's my assumptions:

  1. The setup part of the script e.g. setting the $context and $responder will remain constant forever (or you could live with the downtime required to restart the script because of changes to that).

  2. The start and end of the loop remain constant, my idea is that shell_exec will return a response which can be used by the responder as an actual response.

Some more clarifications:

I'm using serialize to pass an array of arguments which work.php will need to use. I am doing a base64 encoding decoding because I want to ensure the entire argument will fit in $argv[0] and not get split on potential spaces that are found in the arguments.

The same serialize -> base64_encode and base64_decode -> deserialize combination can be used for the result of work.php.

Note that I have not personally tried this code at all so I can't guarantee that it works. I just don't see any reason why it shouldn't work (make sure php is in your path or call /usr/bin/php in your shell_exec if it isn't).

It is worth noting that this solution will be quite slower than having all the code in one file but this is the cost of refreshing the script on each iteration.



回答3:

Q2: And what is the right way to do this?

The right way to do anything in my opinion is to choose the best option available for you at the beginning so as not to invest time in a method that would yield you second grade results. I have nothing against ZeroMQ though I follow the logic that programmers should always strive to do the cleanest code and use the best possible tools. In the case of creating a Message queue with PHP, you would have a lot more success with Pheanstalk: https://github.com/pda/pheanstalk

It is a highly recommended open source option online and works perfectly on linux. Installing the queue is very simple, and I wrote a complete answer on how to install pheanstalk in the following topic: Unable to get Beanstalkd Queue to work for PHP

Pheanstalk uses the beanstalkd library, which is very light weight and efficient. To preform a message queue as your question suggests, you could do so with two simple php scrips:

Message producer:

<?php
$pheanstalk = new Pheanstalk('127.0.0.1:11300');
$pheanstalk
  ->useTube("my_queue")
  ->put("Hello World");
?>

Worker script:

<?php
    if ($job = $pheanstalk
    ->watch('testtube')
    ->ignore('default')
    ->reserve())//retreives the job if there is one in the queue
    {
        echo $job->getData();//echos the message
        $pheanstalk->delete($job);//deletes the job from the queue
    }
}
?>

The message producer would be included in a page where a user would create a message, and that would be sent to a queue generated by beanstalkd. The worker script could be designed in different ways. You could place it in an a while loop to search for a new queue every second, and you could even have multiple workers searching the queue. Pheanstalk is very efficient and highly recommended.