I am trying to figure out how SPIN chooses the order in which to execute and terminate processes in the following example. I realize that a main focus of SPIN is analyzing concurrent processes, but for my purposes I am just interested in simple linear execution. In the following example I just want step1() then step2() to be executed in that order.
int globA;
int globB;
proctype step1() {
atomic {
globA = 1;
}
}
proctype step2() {
atomic {
globB = 2;
}
}
init {
atomic {
run step1();
run step2();
}
}
However, even with the atomic {} declarations, the processes become interleaved in their execution. Using the command spin -p my_pml_code.pml
I get the following 3 outputs only (I ran it many times and these were the only outputs).
Run 1:
0: proc - (:root:) creates proc 0 (:init:)
Starting step1 with pid 1
1: proc 0 (:init::1) creates proc 1 (step1)
1: proc 0 (:init::1) pml_testing/transition_testing.pml:16 (state 1) [(run step1())]
Starting step2 with pid 2
2: proc 0 (:init::1) creates proc 2 (step2)
2: proc 0 (:init::1) pml_testing/transition_testing.pml:17 (state 2) [(run step2())]
3: proc 2 (step2:1) pml_testing/transition_testing.pml:11 (state 1) [globB = 2]
4: proc 1 (step1:1) pml_testing/transition_testing.pml:6 (state 1) [globA = 1]
4: proc 2 (step2:1) terminates
4: proc 1 (step1:1) terminates
4: proc 0 (:init::1) terminates
3 processes created
The order is proc 0 -> 0 -> 0 -> 0 ->2 -> 1 -> 2 -> 1 -> 0
Run 2:
0: proc - (:root:) creates proc 0 (:init:)
Starting step1 with pid 1
1: proc 0 (:init::1) creates proc 1 (step1)
1: proc 0 (:init::1) pml_testing/transition_testing.pml:16 (state 1) [(run step1())]
Starting step2 with pid 2
2: proc 0 (:init::1) creates proc 2 (step2)
2: proc 0 (:init::1) pml_testing/transition_testing.pml:17 (state 2) [(run step2())]
3: proc 1 (step1:1) pml_testing/transition_testing.pml:6 (state 1) [globA = 1]
4: proc 2 (step2:1) pml_testing/transition_testing.pml:11 (state 1) [globB = 2]
4: proc 2 (step2:1) terminates
4: proc 1 (step1:1) terminates
4: proc 0 (:init::1) terminates
3 processes created
The order is proc 0 -> 0 -> 0 -> 0 -> 1 -> 2 -> 2 -> 1 -> 0
Run 3:
0: proc - (:root:) creates proc 0 (:init:)
Starting step1 with pid 1
1: proc 0 (:init::1) creates proc 1 (step1)
1: proc 0 (:init::1) pml_testing/transition_testing.pml:16 (state 1) [(run step1())]
Starting step2 with pid 2
2: proc 0 (:init::1) creates proc 2 (step2)
2: proc 0 (:init::1) pml_testing/transition_testing.pml:17 (state 2) [(run step2())]
3: proc 2 (step2:1) pml_testing/transition_testing.pml:11 (state 1) [globB = 2]
3: proc 2 (step2:1) terminates
4: proc 1 (step1:1) pml_testing/transition_testing.pml:6 (state 1) [globA = 1]
4: proc 1 (step1:1) terminates
4: proc 0 (:init::1) terminates
3 processes created
The order is proc 0 -> 0 -> 0 -> 0 ->2 -> 2 -> 1 -> 1 -> 0
The output I am trying to get it simply: proc 0 -> 0 -> 0 -> 0 ->1 -> 1 -> 2 -> 2 -> 0
I realize that proc 0 can't terminate until 1 and 2 have terminated, but why are the terminations of 2 and 1 being interleaved non-deterministicly? Any why is SPIN randomly choosing between executing proc 1 and proc 2 when the init function is atomic and therefore should be executed in order? And why can I get proc 2 to start and terminate before proc 1 (in Run 3) but not the other way around?
Note: I have also tested this using dstep
instead of atomic
and I get the same result.
First, let me try give short answers to each of your questions:
1. I realise that proc 0 can't terminate until 1 and 2 have terminated, but why are the terminations of 2 and 1 being interleaved non-deterministically?
The processes always terminate in a deterministic fashion: 2
terminates before 1
, 1
before 0
and 0
is always the last one. However, there is nothing special about process termination: it is simply the final transition that is taken by a process. As a result, process interleaving is possible at any point in time in which there are more than one process with an (immediately) executable instruction.
2. Any why is SPIN randomly choosing between executing proc 1 and proc 2 when the init function is atomic and therefore should be executed in order?
Although it is true that init
executes both of his instructions atomically, the important fact to keep in mind is that step1
and step2
are independent processes and are executed after init
exits its atomic block: run
is not a function call, it just spawns a process inside the environment with absolutely no guarantee that such process is executed immediately. That might or might not happen depending on whether the spawned process has any executable instruction, whether the process that is currently executing is in an atomic sequence and on the outcome of the non-deterministic scheduler process selection.
3. And why can I get proc 2 to start and terminate before proc 1 (in Run 3) but not the other way around?
In Promela, processes can only die in reverse order of their creation, as indicated in the docs:
When a process terminates, it can only die and make its _pid number
available for the creation of another process, if and when it has the
highest _pid number in the system. This means that processes can only
die in the reverse order of their creation (in stack order).
Therefore, 2
can terminate before 1
because it has higher _pid
value, whereas 1
must wait upon 2
to be terminate before it can die.
4. How does SPIN decide the order of process execution in atomic processes?
There is no such thing as an atomic process if you have more than one in your system. Even if you enclose the whole body of a process inside the atomic keyword, the termination step is still outside the atomic block. The scheduler never interrupts a process executing an atomic sequence, unless the process blocks in front of an instruction that is not executable. In such a case atomicity is lost and any other process can be scheduled for execution, as described in the docs:
If any statement within the atomic sequence blocks, atomicity is lost,
and other processes are then allowed to start executing statements.
When the blocked statement becomes executable again, the execution of
the atomic sequence can be resumed at any time, but not necessarily
immediately. Before the process can resume the atomic execution of the
remainder of the sequence, the process must first compete with all
other active processes in the system to regain control, that is, it
must first be scheduled for execution.
In your question you state that your goal is to obtain the following execution-flow:
proc 0 -> 0 -> 0 -> 0 ->1 -> 1 -> 2 -> 2 -> 0
In your code example, this execution-flow is forbidden because it makes process 1
terminate before process 2
and this is not allowed by the rules (see the answer to your third question).
Note: I have also tested this using dstep instead of atomic and I get the same result.
No statement inside your atomic block can block, therefore there is absolutely no difference among using d_step
or atomic
in your code. However, I invite you to read this answer to get an insight of similarities and differences among atomic and d_step.
EXAMPLE EXECUTION FLOW:
Second, let me try a deeper answer level based on the analysis of the execution-flow.
In your code example there are three processes.
init
is (always) the first process to be created (when available), and for this reason it is (always) assigned _pid
equal to 0
and scheduled for first. Since the whole body of the init
process is enclosed in an atomic block, this process executes run step1();
and run step2()
without interleaving with other processes. Process step1
is assigned _pid
equal 1
, because it's the second process to be created, whereas process step2
is assigned _pid
equal 2
, since it's the third process to be created.
In your example, processes step1
and step2
cannot be scheduled for execution until when the init
process reaches the end of the atomic, which in your code example coincides with the end of the init
code.
When init
reaches the end of its body, the process (with _pid
equal to 0
) cannot die yet because inside the environment there is at least one process with a _pid
value greater than his own, namely step1
and step2
. Although init
is blocked, both step1
and step2
are ready for execution, so the non-deterministic scheduler can arbitrarily select either step1
or step2
for execution.
If step1
is scheduled first, then it executes its only instruction globA = 1;
without interleaving with step2
. Notice that, since there is only one instruction inside the atomic block and this instruction is atomic on his own, the atomic block is redundant (the same holds for step2
). Again, since step1
has _pid
equal to 1
and there is still a process with higher _pid
value around, process step1
cannot die yet. At this point, the only process that can be scheduled for execution is step2
, which can also terminate because there is no process with higher _pid
value. This allows step1
to terminate, which in turn allows init
to die as well. This execution-flow corresponds to your run 2.
If step2
is scheduled first, then once this process has assigned the value 2
to globB
and reaches the end of its body, which is outside the atomic block, there are two possible execution flows:
case A) the scheduler non-deterministically selects step2
to execute again, and step2
terminates; now the only available option for the scheduler is to make step1
execute its own instruction, make it terminate and then allow init
to terminate too. This execution flow corresponds to run 1.
case B) the scheduler non-deterministically selects step1
to execute, step1
assigns 1
to globA
but cannot terminate because step2
is still alive; the only process that can be scheduled is step2
, so the latter terminates after being selected by the scheduler, allowing step1
and init
to terminate as well in cascade. This execution flow corresponds to run 3.
LINEAR EXECUTION:
The simplest and most obvious way to achieve linear execution is to have only one process inside your model. It is trivial to see why this is the case. So your example would become:
int globA;
int globB;
inline step1()
{
globA = 1;
}
inline step2()
{
globB = 2;
}
init
{
step1();
step2();
}
Having atomic blocks in this code is no longer necessary, since there is only one process. Of course, you might frown upon such a trivial solution, so let's see another solution based on a global variable:
int globA;
int globB;
bool terminated;
proctype step1()
{
globA = 1;
terminated = true;
}
proctype step2()
{
globB = 2;
terminated = true;
}
init
{
run step1();
terminated -> terminated = false;
run step2();
terminated -> terminated = false;
}
Differently than your code example, here globB = 2
can never be executed before globA = 1
is executed thanks to the synchronisation variable terminated
. However, similarly to your code example, the actual termination of processes step1
and step2
is subject to interleaving. i.e. if step1
terminates immediately, so that step2
is created only after step1
has completely released the resources it owns, then step2
is assigned _pid
equal to 1
; otherwise, step2
is assigned _pid
equal to 2
.
The best solution I can think of is based on the concept of message passing. Basically, the idea is to allow only the process that currently holds a token to be scheduled at any given point in time, and to pass around such token in the desired scheduling order:
int globA;
int globB;
mtype = { TOKEN };
proctype step1(chan in, out)
{
in ? TOKEN ->
globA = 1;
out ! TOKEN;
}
proctype step2(chan in, out)
{
in ? TOKEN ->
globB = 2;
out ! TOKEN;
}
init
{
chan token_ring[2] = [0] of { mtype };
run step1(token_ring[0], token_ring[1]);
run step2(token_ring[1], token_ring[0]);
token_ring[0] ! TOKEN;
token_ring[0] ? TOKEN;
}
Notice that this solution forces only one possible scheduling. This can be verified by running an interactive simulation:
~$ spin -i ring.pml
Select a statement
choice 2: proc 0 (:init::1) ring.pml:25 (state 2) [(run step2(token_ring[1],token_ring[0]))]
Select [1-2]: 2
Select a statement
choice 3: proc 0 (:init::1) ring.pml:27 (state 3) [token_ring[0]!TOKEN]
Select [1-3]: 3
Select a statement
choice 2: proc 1 (step1:1) ring.pml:9 (state 2) [globA = 1]
Select [1-3]: 2
Select a statement
choice 2: proc 1 (step1:1) ring.pml:10 (state 3) [out!TOKEN]
Select [1-3]: 2
Select a statement
choice 1: proc 2 (step2:1) ring.pml:16 (state 2) [globB = 2]
Select [1-3]: 1
Select a statement
choice 1: proc 2 (step2:1) ring.pml:17 (state 3) [out!TOKEN]
Select [1-3]: 1
Select a statement
choice 1: proc 2 (step2:1) ring.pml:18 (state 4) <valid end state> [-end-]
Select [1-3]: 1
Select a statement
choice 1: proc 1 (step1:1) ring.pml:11 (state 4) <valid end state> [-end-]
Select [1-2]: 1
3 processes created
As you can see, the user is never offered the chance to make any choice, because there is only one possible execution-flow. This is obviously due to the fact that A) I did not put any instruction before in!TOKEN
and after out!TOKEN
B) the desired scheduling order coincides with the order in which processes are created.