How, if at all, do Erlang Processes map to Kernel

2019-01-06 10:23发布

问题:

Erlang is known for being able to support MANY lightweight processes; it can do this because these are not processes in the traditional sense, or even threads like in P-threads, but threads entirely in user space.

This is well and good (fantastic actually). But how then are Erlang threads executed in parallel in a multicore/multiprocessor environment? Surely they have to somehow be mapped to kernel threads in order to be executed on separate cores?

Assuming that that's the case, how is this done? Are many lightweight processes mapped to a single kernel thread?

Or is there another way around this problem?

回答1:

Answer depends on the VM which is used:

1) non-SMP: There is one scheduler (OS thread), which executes all Erlang processes, taken from the pool of runnable processes (i.e. those who are not blocked by e.g. receive)

2) SMP: There are K schedulers (OS threads, K is usually a number of CPU cores), which executes Erlang processes from the shared process queue. It is a simple FIFO queue (with locks to allow simultaneous access from multiple OS threads).

3) SMP in R13B and newer: There will be K schedulers (as before) which executes Erlang processes from multiple process queues. Each scheduler has it's own queue, so process migration logic from one scheduler to another will be added. This solution will improve performance by avoiding excessive locking in shared process queue.

For more information see this document prepared by Kenneth Lundin, Ericsson AB, for Erlang User Conference, Stockholm, November 13, 2008.



回答2:

I want to ammend previous answers.

Erlang, or rather the Erlang runtime system (erts), defaults the number of schedulers (OS threads) and the number of runqueues to number of processing elements on your platform. That is processors cores or hardware threads. You can change these settings in runtime using:

erlang:system_flag(schedulers_online, NP) -> PrevNP

The Erlang processes does not have any affinity to any schedulers yet. The logic balancing the processes between the schedulers follows two rules. 1) A starving scheduler will steal work from another scheduler. 2) Migration paths are setup to push processes from schedulers with lots of processes to schedulers with less work. This is done to assure fairness in reduction count (execution time) for each process.

Schedulers however can be locked to specific processing elements. This not done by default. To let erts do the scheduler->core affinity use:

erlang:system_flag(scheduler_bind_type, default_bind) -> PrevBind

Several other bind types can be found in the documentation. Using affinity can greatly improve performance in heavy load situations! Especially in high lock contention situations. Also, the linux kernel cannot handle hyperthreads to say the least. If you have hyperthreads on your platform you should really use this feature in erlang.



回答3:

I'm purely guessing here, but I'd imagine that there's a small number of threads, which pick processes from a common process pool for execution. Once a process hits a blocking operation, the thread executing it puts it aside and picks another. When a process being executed causes another process to become unblocked, that newly unblocked process gets placed into the pool. I suppose a thread might also stop execution of a process even when it's not blocked at certain points to serve other processes.



回答4:

I would like to add some input to what was described in the accepted answer.

Erlang Scheduler is the essential part of the Erlang Runtime System and provides its own abstraction and implementation of the conception of lightweight processes atop the OS threads.

Each Scheduler runs within a single OS thread. Normally, there are as many schedulers as CPU (cores) are on he hardware (it is configurable though and naturally does not bring much value when number of schedulers exceeds those of hardware cores). The system might also be configured that scheduler will not jump between OS threads.

Now, when the Erlang process is being created it is entirely the responsibility of the ERTS and Scheduler to manage life cycle and resources consumption as well as its memory footprint etc.

One of the core implementation details is that each process has a time budget of 2000 reductions available when the Scheduler picks up that process from the run queue. Each progress in the system (even I/O) is guaranteed to have a reductions budget. That is what actually makes ERTS a system with preemptive multitasking.

I would recommend a great blog post on that topic by Jesper Louis Andersen http://jlouisramblings.blogspot.com/2013/01/how-erlang-does-scheduling.html

As the short answer: Erlang processes are not OS threads and do not map to them directly. Erlang Schedulers are what runs on the OS threads and provide smart implementation of more finely grained Erlang processes hiding those details behind programmer's eyes.