Say there is a process B, which receives a pid and sends m2 to it. If you spawn A and send it m1, and then send A to B, is A guaranteed to get m1 before m2?
In other words, can this crash?
-module(test).
-compile(export_all).
test() ->
    B = spawn_link(fun() -> receive P -> P ! m2 end end),
    A = spawn_link(fun() -> receive X -> X = m1 end end),
    A ! m1,
    B ! A.
Your code cannot crash because all processes are local.
B = spawn_link(fun() -> receive P -> P ! m2 end end), % 1
A = spawn_link(fun() -> receive X -> X=m1 end end), % 2
A ! m1, % 3
B ! A. % 4
When evaluating line 3, both the BEAM emulator and HiPE invoke the erl_send built-in function (BIF). Since A is a local process, erl_send (actually do_send) eventually calls erts_send_message, which enqueues the message in the mailbox. In SMP mode, the thread actually acquires a lock on the mailbox.
So when line 4 is evaluated and A is sent to process B, A already has m1 in its mailbox. Therefore m2 can only be enqueued after m1.
Whether this result is particular to the current implementation of Erlang is debatable, even though it is not guaranteed by the documentation. Indeed, each process needs a mailbox, and this mailbox needs to be filled somehow. This is done synchronously on line 3. Doing it asynchronously would require either another thread in between or several mailboxes per process (e.g. one per scheduler, to avoid the lock on the mailbox). Yet I do not think this would make sense performance-wise.
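To check the local case empirically, something like the following could be used. This is only a sketch: the module name order_check and the functions run/1 and trial/1 are made up for illustration, and a passing run is an observation about one implementation, not a proof of a guarantee. It repeats the scenario from the question N times on a single node and counts how often A saw m1 first.

```erlang
-module(order_check).
-export([run/1]).

%% Hypothetical helper: repeat the question's scenario N times on the
%% local node and return how many times A received m1 first.
run(N) ->
    Parent = self(),
    Results = [trial(Parent) || _ <- lists:seq(1, N)],
    %% The generator pattern m1 skips every element that is not m1.
    length([ok || m1 <- Results]).

trial(Parent) ->
    B = spawn(fun() -> receive P -> P ! m2 end end),
    %% Instead of crashing on a bad match, A reports the first message it sees.
    A = spawn(fun() -> receive Msg -> Parent ! {self(), Msg} end end),
    A ! m1,
    B ! A,
    receive {A, First} -> First end.
```

Calling order_check:run(1000) from the shell returns the number of trials in which m1 arrived first. If the reasoning above holds, that number should equal the argument.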
If processes A and B were both remote but on the same node, the behavior would be slightly different, but the result would be the same with the current implementation of Erlang. On line 3, message m1 will be enqueued for the remote node, and on line 4, message A will be enqueued afterwards. When the remote node dequeues the messages, it will first write m1 to A's mailbox before writing A to B's mailbox.
If process A was remote and B was local, the result would still be the same. On line 3, message m1 will be enqueued for the remote node, and on line 4, the message will be written to B's mailbox. Then, on line 1, message m2 will be enqueued for the remote node after m1. So A will get the messages in m1, m2 order.
Likewise, if process A was local and B was remote, A will get the message copied to its mailbox on line 3 before anything is sent over the network to B's node.
With the current version of Erlang, the only way for this to crash is to have A and B on distinct remote nodes. In this case, m1 is enqueued to A's node before A is enqueued to B's node. However, delivery of these messages is not synchronous. Delivery to B's node could happen first, for example if many messages are already enqueued for A's node.
The following code (sometimes) triggers the crash by filling the queue to A's node with junk messages that slow down the delivery of m1.
$ erl -sname node_c@localhost
C = spawn_link(fun() ->
    A = receive {process_a, APid} -> APid end,
    B = receive {process_b, BPid} -> BPid end,
    ANode = node(A),
    lists:foreach(fun(_) ->
        rpc:cast(ANode, erlang, whereis, [user])
    end, lists:seq(1, 10000)),
    A ! m1,
    B ! A
end),
register(process_c, C).
$ erl -sname node_b@localhost
B = spawn_link(fun() -> receive P -> P ! m2 end end),
C = rpc:call(node_c@localhost, erlang, whereis, [process_c]),
C ! {process_b, B}.
$ erl -sname node_a@localhost
A = spawn_link(fun() -> receive X -> X = m1 end, io:format("end of A\n") end),
C = rpc:call(node_c@localhost, erlang, whereis, [process_c]),
C ! {process_a, A}.
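This is not part of the test above, but if the goal is only to keep A from crashing when m2 overtakes m1, one option is to have A match selectively instead of accepting whatever arrives first. Below is a minimal sketch, assuming A can simply wait for m1 and then for m2. The module name selective is made up for illustration, and the messages are deliberately sent in the "wrong" order on a single node to mimic the reordering.

```erlang
-module(selective).
-export([test/0]).

test() ->
    Parent = self(),
    A = spawn_link(fun() ->
            %% Selective receive: m2 stays in the mailbox until m1 has
            %% been taken out, whatever the arrival order was.
            receive m1 -> ok end,
            receive m2 -> ok end,
            Parent ! {self(), done}
        end),
    %% Deliberately send m2 first to simulate the cross-node race.
    A ! m2,
    A ! m1,
    receive {A, done} -> ok end.
```

Here selective:test() returns ok, because A never pattern-matches m2 against m1. This does not change the delivery order; it only makes A tolerant of it.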
If both processes are on the same node, it is true that A is guaranteed to get m1 before m2.
But when the two processes are on different nodes, this is not guaranteed.
There is a paper about this problem: Programming Distributed Erlang Applications: Pitfalls and Recipes.
Here is the link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.116.9929&rep=rep1&type=pdf
Your problem is covered in section 2.2 of this paper, and I think it is a really interesting paper!