I have written an OTP-compliant application with a gen_server and a supervisor, and I have a script to start them.
My script contains something like this:
erl -pa module_name/ebin -name abc@hostname -setcookie test -s module_sup start_link()
This does not start the supervisor. But when I call module_sup:start_link() inside the shell, it works.
Also, when I run
erl -pa module_name/ebin -name abc@hostname -setcookie test -s module_srv start_link()
i.e. the server alone without the supervisor, the server gets started.
So what am I doing wrong here? Are we not allowed to start a supervisor this way?
Any help would be highly appreciated.
Thanks, Wilson
supervisor:start_link/2 creates a link to its calling process. When that calling process exits, the supervisor is taken down with it.
erl -s module_sup start_link does start the supervisor, but it is killed because your start function runs inside its own process, which dies once the function exits.
You can observe similar behavior with
spawn(module_sup, start_link, []).
The supervisor starts and gets killed immediately. When you start the supervisor manually, the calling process is the shell, so the supervisor lives until the shell exits.
Generally, the top-level supervisor is meant to be started by an application.
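A minimal sketch of what "started by an application" can look like. The names module_app and module_app_sup are hypothetical and do not come from the question. To keep the example self-contained, the module implements both the application and the supervisor callbacks and has an empty child list; in the asker's project, start/2 could simply call module_sup:start_link() instead.

```erlang
%% module_app.erl: hypothetical name, not part of the original question.
-module(module_app).
-behaviour(application).
-behaviour(supervisor).

-export([start/2, stop/1]).   % application callbacks
-export([init/1]).            % supervisor callback

%% Called from the application master, a process that lives as long as
%% the application does, so the link made by supervisor:start_link/3
%% is not broken by a short-lived caller.
%% In the asker's project this could call module_sup:start_link().
start(_StartType, _StartArgs) ->
    supervisor:start_link({local, module_app_sup}, ?MODULE, []).

stop(_State) ->
    ok.

%% Empty child list to keep the sketch short; the child spec for
%% module_srv would go here.
init([]) ->
    {ok, {{one_for_one, 5, 10}, []}}.

%% Matching ebin/module_name.app resource file (assumed layout):
%%
%% {application, module_name,
%%  [{description, "Example application"},
%%   {vsn, "0.1.0"},
%%   {modules, [module_app]},
%%   {registered, [module_app_sup]},
%%   {applications, [kernel, stdlib]},
%%   {mod, {module_app, []}}]}.
```

With the .app file on the code path, the start script could then use something like erl -pa module_name/ebin -name abc@hostname -setcookie test -eval "application:start(module_name)" instead of calling start_link directly.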
This is very similar to "How do I start applications by command line as a daemon?" In short, you can't use -s to start a supervisor unless you use unlink/1, which is a total kludge. Your time is better spent learning how to package your code as an application. I'd recommend doing this with rebar.
It is important to notice that a process only dies if the linked process terminates with a reason other than 'normal', which means that a process that simply finishes its execution does not kill the processes linked to it (source: http://www.erlang.org/doc/reference_manual/processes.html#id204170). I think that is an important aspect of Erlang that should not be misinterpreted.
The following source code shows this:
You can see that the caller <0.37.0> is not running, but the process <0.38.0> is still there, waiting for a message.
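An experiment of this kind can be sketched as follows. The module name link_demo is hypothetical, and the pids on your system will differ from the ones quoted above. A caller links to a waiting process and then simply returns, which is an exit with reason normal; the linked process does not trap exits, so it should ignore that signal.

```erlang
-module(link_demo).
-export([run/0]).

%% Spawn a caller that links to a waiting process and then returns,
%% i.e. exits with reason normal. Report who is alive afterwards as
%% {CallerAlive, WaiterAlive}.
run() ->
    Self = self(),
    Caller = spawn(fun() ->
                     Waiter = spawn_link(fun() ->
                                           receive stop -> ok end
                                         end),
                     Self ! {waiter, Waiter}
                     %% the fun returns here: the caller exits normally
                   end),
    Pid = receive {waiter, W} -> W end,
    %% Wait until the caller is really gone before checking the waiter.
    Ref = erlang:monitor(process, Caller),
    receive {'DOWN', Ref, process, Caller, _Reason} -> ok end,
    timer:sleep(100),  % give any exit signal time to be delivered
    Result = {is_process_alive(Caller), is_process_alive(Pid)},
    Pid ! stop,        % clean up the waiting process
    Result.
```

Calling link_demo:run() should return {false, true}: the caller is gone, while the linked process is still waiting for a message.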
In any case, I would not expect the supervisor to terminate when the caller terminates, since a supervisor traps exit signals, unless it is explicitly programmed to do so. I examined the source code and could not find such handling, but my analysis may have been too superficial.
Have you had any luck with that? I will try to run some tests and see if I can figure out what is happening.
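One such test could look like the sketch below. The module name sup_link_demo is hypothetical. It starts an empty supervisor from a helper process, lets the helper exit normally, and reports what happens to the supervisor. As far as I understand it, a supervisor is built on gen_server, and a gen_server that traps exits treats an 'EXIT' message from its parent as a request to terminate with the same reason, even if that reason is normal. If that is right, the supervisor should go down here, while a plain gen_server that does not trap exits would ignore the normal exit signal and survive, which matches what the question describes.

```erlang
-module(sup_link_demo).
-behaviour(supervisor).
-export([start_link/0, init/1, started_from_short_lived_process/0]).

%% Start an empty supervisor linked to the calling process.
start_link() ->
    supervisor:start_link(?MODULE, []).

%% No children: enough to observe the supervisor's own lifetime.
init([]) ->
    {ok, {{one_for_one, 1, 5}, []}}.

%% Start the supervisor from a helper process, monitor it, then let the
%% helper exit normally. Returns {down, Reason} if the supervisor
%% terminates within a second, still_alive otherwise.
started_from_short_lived_process() ->
    Self = self(),
    Helper = spawn(fun() ->
                     {ok, Pid} = start_link(),
                     Self ! {sup, Pid},
                     %% stay alive until the monitor is in place
                     receive stop -> ok end
                   end),
    receive
        {sup, Sup} ->
            Ref = erlang:monitor(process, Sup),
            Helper ! stop,  % helper now returns, exiting with normal
            receive
                {'DOWN', Ref, process, Sup, Reason} ->
                    {down, Reason}
            after 1000 ->
                erlang:demonitor(Ref, [flush]),
                still_alive
            end
    end.
```

If the call returns a {down, Reason} tuple rather than still_alive, the supervisor did follow its parent down even though the parent finished normally.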