Erlang Node to Node Messaging Throughput, Timeouts

Published 2019-04-10 21:21

Question:

Now, suppose we are designing an application that consists of two Erlang nodes. Node A will run very many processes, on the order of thousands. These processes access resources on Node B by sending a message to a registered process on Node B.

On Node B, let's say you have a process started by executing the following function:

start_server()->
    register(zeemq_server,spawn(?MODULE,server,[])),ok.
server()->
    receive
        {{CallerPid, Ref}, {Module, Func, Args}} ->
            Result = (catch erlang:apply(Module, Func, Args)),
            CallerPid ! {Ref, Result},
            server();
        _ ->
            server()
    end.

On Node A, any process that wants to execute a function in a given module on Node B uses the following piece of code:
call(Node, Module, Func, Args)->
    Ref = make_ref(),
    Me = self(),
    {zeemq_server, Node} ! {{Me, Ref}, {Module, Func, Args}},
    receive
        {Ref, Result} -> Result
    after timer:minutes(3) ->
        error_logger:error_report(["Call to server took too long"]),
        {error, remote_call_failed}
    end.
So, assuming that the process zeemq_server on Node B will never be down, and that the network connection between Nodes A and B is always up, please answer the following questions:

Qn 1: Since there is only one receiving process on Node B, its mailbox is most likely to be full all the time. This is because there are many processes on Node A, and at a given interval, say 2 seconds, every process makes at least one call to the Node B server. In what ways can reception on Node B be made redundant, e.g. with process groups etc.? Explain (the concept) how this would replace the server-side code above, and show what changes would happen on the client side.

Qn 2: In a situation where there is only one receiver on Node B, is there a maximum number of messages allowed in the process mailbox? How would Erlang respond if a single process's mailbox is flooded with too many messages?

Qn 3: In what ways, using the very concept shown above, can I guarantee that every process which sends a request gets back an answer as soon as possible, before the timeout occurs? Could converting the reception part on Node B into a parallel operation help? Like this:
start_server()->
    register(zeemq_server,spawn(?MODULE,server,[])),ok.
server()->
    receive
        {{CallerPid, Ref}, {Module, Func, Args}} ->
            spawn(?MODULE, child, [Ref, CallerPid, {Module, Func, Args}]),
            server();
        _ ->
            server()
    end.

child(Ref, CallerPid, {Module, Func, Args})->
    Result = (catch erlang:apply(Module, Func, Args)),
    CallerPid ! {Ref, Result},
    ok.

The method shown above may increase the instantaneous number of processes running on Node B, and this may greatly affect the service due to memory use. However, it looks good and lets the server() loop return immediately to handle the next request. What is your take on this modification?

Lastly: Illustrate how you would implement a pool of receiver processes on Node B that still appears under one name as far as Node A is concerned, such that incoming messages are multiplexed amongst the receiver processes and the load is shared within this group. Keep the meaning of the problem the same.

Thank you, Erlangers!

Answer 1:

The maximum number of messages in a process mailbox is unbounded, except by the amount of memory.

Also, if you need to inspect the mailbox size, use

erlang:process_info(self(),[message_queue_len,messages]).

This will return something like:

[{message_queue_len,0},{messages,[]}]

What I suggest is that you first convert your server above into a gen_server. This is your worker.
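
A minimal sketch of what such a gen_server worker might look like, assuming the request carries the same {Module, Func, Args} tuple as in the original code (the module name zeemq_worker is an assumption for illustration):

-module(zeemq_worker).
-behaviour(gen_server).

-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% poolboy expects its workers to export start_link/1.
start_link(Args) ->
    gen_server:start_link(?MODULE, Args, []).

init(_Args) ->
    {ok, no_state}.

%% Run the requested MFA and reply to the caller synchronously.
handle_call({Module, Func, Args}, _From, State) ->
    Result = (catch erlang:apply(Module, Func, Args)),
    {reply, Result, State};
handle_call(_Other, _From, State) ->
    {reply, {error, unknown_request}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.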

Next, I suggest using poolboy ( https://github.com/devinus/poolboy ) to create a pool of instances of your server as poolboy workers (there are examples in their GitHub Readme.md). Lastly, I suggest creating a module for callers with a helper function that wraps the call in a poolboy transaction, checking a Worker out of the pool and passing it to a fun. Example below, cribbed from their GitHub:

squery(PoolName, Sql) ->
    poolboy:transaction(PoolName, fun(Worker) ->
                                     gen_server:call(Worker, {squery, Sql})
                                  end).
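
For completeness, here is a rough sketch of how such a pool could be started; the pool name zeemq_pool, the size numbers, and the worker module are illustrative assumptions, following the pattern in the poolboy README:

start_pool() ->
    PoolArgs = [{name, {local, zeemq_pool}},
                {worker_module, zeemq_worker},
                {size, 10},
                {max_overflow, 20}],
    WorkerArgs = [],
    poolboy:start_link(PoolArgs, WorkerArgs).

In production you would normally put the equivalent poolboy:child_spec/3 under a supervisor rather than linking the pool directly, as the README shows.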

That said, would Erlang RPC suit your needs better? Details on Erlang RPC at http://www.erlang.org/doc/man/rpc.html. A good treatment of Erlang RPC is found at http://learnyousomeerlang.com/distribunomicon#rpc.
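
For example, a single remote call from Node A would then reduce to something like the following, where Node and the MFA are placeholders and the timeout mirrors the 3-minute limit in the original client:

%% Synchronous remote call with an explicit timeout.
Result = rpc:call(Node, Module, Func, Args, timer:minutes(3)).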



Answer 2:

IMO spawning a new process to handle each request may be overkill, but it's hard to say without knowing what has to be done with each request.

You can have a pool of processes to handle the messages, using a round-robin method to distribute the requests, or, based on the type of request, either handle it directly, send it to a child process, or spawn a new process. You can also monitor the load of the pooled processes by looking at their message queues and start new children if they are overloaded. Use a supervisor, and use erlang:send_after/3 in init to re-check the load every few seconds and act accordingly (see the sketch below). Use OTP if you can; there is overhead, but it is worth it.
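
A rough sketch of that dispatcher idea, registered under the single name zeemq_server so the client code on Node A stays unchanged; the function names and layout here are illustrative assumptions, not a complete implementation:

%% Spawn N workers and register a round-robin dispatcher under the old name.
start_server(N) ->
    Workers = [spawn(?MODULE, worker, []) || _ <- lists:seq(1, N)],
    register(zeemq_server, spawn(?MODULE, dispatcher, [Workers])),
    ok.

%% Forward each request to the next worker, then rotate the worker list.
dispatcher([W | Rest]) ->
    receive
        {{_CallerPid, _Ref}, {_Module, _Func, _Args}} = Request ->
            W ! Request,
            dispatcher(Rest ++ [W]);
        _ ->
            dispatcher([W | Rest])
    end.

%% Each worker runs the same loop as the original single server.
worker() ->
    receive
        {{CallerPid, Ref}, {Module, Func, Args}} ->
            Result = (catch erlang:apply(Module, Func, Args)),
            CallerPid ! {Ref, Result},
            worker();
        _ ->
            worker()
    end.

Checking each worker's message_queue_len with erlang:process_info/2 before forwarding, and spawning extra workers when the queues grow, would extend this sketch into the load-aware version described above.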

I wouldn't use HTTP for dedicated-line communication; I believe it adds too much overhead. You can control the load by using a pool of processes to handle it.