As I have understood so far, JavaScript is single-threaded. If you defer the execution of some procedure, you just schedule it (queue it) to be run the next time the thread is free. But Async.js defines two methods, Async::parallel and Async::parallelLimit, and I quote:
- parallel(tasks, [callback])
Run an array of functions in parallel, without waiting until the previous function has completed. If any of the functions pass an error to its callback...
- parallelLimit(tasks, limit, [callback])
The same as parallel only the tasks are executed in parallel with a maximum of "limit" tasks executing at any time.
As far as I understand English, "doing tasks in parallel" means doing them at the same time, simultaneously.
How can Async.js execute tasks in parallel in a single thread? Am I missing something?
The functions are not executed simultaneously, but when the first function hands off to an asynchronous task (e.g. setTimeout, network, ...), the second one starts, even if the first function hasn't called the provided callback yet.
As for the number of parallel tasks: That depends on what you pick.
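To make the role of the limit concrete, here is a minimal sketch of a limited runner. It is not the Async.js implementation: runWithLimit and delayTask are hypothetical names invented for this illustration, setTimeout stands in for real I/O, and the delays are arbitrary.

```javascript
// Hypothetical stand-in for a limited parallel runner; NOT the Async.js source.
function runWithLimit(tasks, limit, done) {
  const results = new Array(tasks.length);
  let next = 0;     // index of the next task to start
  let running = 0;  // tasks started whose callback has not fired yet
  let finished = 0; // tasks whose callback has fired
  let failed = false;

  if (tasks.length === 0) return done(null, results);

  function startMore() {
    // Start tasks until the limit is reached or none are left.
    while (!failed && running < limit && next < tasks.length) {
      const index = next++;
      running++;
      tasks[index]((err, value) => {
        if (failed) return;
        running--;
        if (err) { failed = true; return done(err); }
        results[index] = value;
        finished++;
        if (finished === tasks.length) return done(null, results);
        startMore(); // a slot is free again
      });
    }
  }
  startMore();
}

const log = [];

// Hypothetical task factory: the task "finishes" after ms milliseconds.
function delayTask(name, ms) {
  return (callback) => {
    log.push('start ' + name);
    setTimeout(() => { log.push('end ' + name); callback(null, name); }, ms);
  };
}

runWithLimit([delayTask('A', 80), delayTask('B', 40), delayTask('C', 20)], 2,
  (err, results) => {
    console.log(log.join(', '));
    console.log(results);
  });
```

With a limit of 2, C is only started once B has called its callback. With a limit of 3 or more, all three tasks would be started right away.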
Your doubts make perfect sense. It's been a few years since you asked this question, but I think it's worth adding a few things to the existing answers.
This sentence is not entirely correct. In fact it does wait for each function to have completed because it's impossible not to do so in JavaScript. Both function calls and function returns are synchronous and blocking. So when it calls any function it has to wait for it to return. What it doesn't have to wait for is the calling of the callback that was passed to that function.
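The difference between a function returning and its callback being called can be seen in a small sketch. The names fetchLater and events are made up for this illustration, and setTimeout stands in for real I/O.

```javascript
const events = [];

// Hypothetical task: it only schedules the work and then returns at once.
function fetchLater(name, ms, callback) {
  events.push(name + ' called');
  setTimeout(() => {
    events.push(name + ' callback');
    callback(null, name + ' result');
  }, ms);
  events.push(name + ' returned');
}

fetchLater('first', 30, () => {});
fetchLater('second', 10, () => {});
events.push('both calls have returned');

// Print the recorded order once both timers have fired.
setTimeout(() => console.log(events.join('\n')), 100);
```

Both calls return before either callback runs, and the callbacks then fire in the order the timers expire (second, then first), not in the order the functions were called.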
Allegory
Some time ago I wrote a short story to demonstrate that very concept:
To quote a part of it:
Theory
I think it's important to emphasize that in single-threaded event loops you can never do more than one thing at once. But you can wait for many things at once just fine. And this is what happens here.
The parallel function from the Async module calls each of the functions one by one, and each function has to return before the next one can be called; there is no way around it. The magic here is that the function doesn't really do its job before it returns. It just schedules some task, registers an event listener, passes some callback somewhere else, adds a resolution handler to some promise, etc.
Then, when the scheduled task finishes, some handler that was previously registered by that function is executed. This in turn executes the callback that was originally passed by the Async module, and the Async module knows that this one function has finished, this time not only in the sense that it returned, but also in the sense that the callback passed to it was finally called.
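The mechanism described above can be sketched in a few lines. This is a simplified, hypothetical stand-in (miniParallel), not the code of the Async module, and makeTask with its delays is invented for the demonstration.

```javascript
// Hypothetical, simplified stand-in for async.parallel; NOT the library's code.
function miniParallel(tasks, done) {
  const results = new Array(tasks.length);
  let pending = tasks.length;
  let failed = false;

  if (pending === 0) return done(null, results);

  tasks.forEach((task, index) => {
    // This call is synchronous: the task must return before the loop moves on.
    task((err, value) => {
      if (failed) return;
      if (err) { failed = true; return done(err); }
      results[index] = value; // stored by position, not by finish order
      if (--pending === 0) done(null, results);
    });
  });
}

const finishOrder = [];

// Hypothetical task factory: calls back with its name after ms milliseconds.
const makeTask = (name, ms) => (callback) =>
  setTimeout(() => { finishOrder.push(name); callback(null, name); }, ms);

miniParallel([makeTask('A', 60), makeTask('B', 40), makeTask('C', 20)],
  (err, results) => {
    console.log('finish order:', finishOrder);
    console.log('results:', results);
  });
```

The tasks finish in the order C, B, A, but the results array is filled by index, so the final callback receives them as A, B, C.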
Examples
So, for example, let's say that you have 3 functions that download 3 different URLs: getA(), getB() and getC().
We will write a mock of the Request module to simulate the requests and some delays:
Now the 3 functions that are mostly the same, with verbose logging:
And finally we're calling them all with the async.parallel function:
What gets displayed immediately is this:
As you can see this is all sequential - functions get called one by one and the next one is not called before the previous one returns. Then we see this with some delays:
So getC finished first, then getB and getA, and then as soon as the last one finishes, async.parallel calls our callback with all of the responses combined and in the correct order: the order in which we listed the functions, not the order in which the requests finished.
Also we can see that the program finishes after 4.071 seconds, which is roughly the time that the longest request took, so we see that the requests were all in progress at the same time.
Now, let's run it with async.parallelLimit with a limit of 2 parallel tasks at most:
Now it's a little bit different. What we see immediately is:
So getA and getB were called and returned, but getC was not called at all yet. Then after some delay we see:
which shows that as soon as getB called the callback, the Async module no longer has 2 tasks in progress but just 1 and can start another one, which is getC, and it does so immediately.
Then after further delays we see:
which finishes the whole process just like in the async.parallel example. This time the whole process also took roughly 4 seconds because the delayed calling of getC didn't make any difference: it still managed to finish before getA, which was called first, finished.
But if we change the delays to these:
then the situation is different. Now async.parallel takes 4 seconds but async.parallelLimit with the limit of 2 takes 5 seconds and the order is slightly different.
With no limit:
With a limit of 2:
Summary
I think the most important thing to remember, no matter whether you use callbacks like in this case, or promises, or async/await, is that in single-threaded event loops you can do only one thing at once, but you can wait for many things at the same time.
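The same point can be sketched with promises instead of callbacks. The helper wait and the delays are assumptions made for this illustration. Promise.all is the standard built-in.

```javascript
// Hypothetical helper: a promise that resolves with `value` after `ms` milliseconds.
const wait = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function main() {
  const started = Date.now();
  // All three timers are started before any of them is awaited.
  const results = await Promise.all([
    wait(300, 'A'),
    wait(200, 'B'),
    wait(100, 'C'),
  ]);
  const elapsed = Date.now() - started;
  console.log(results);                // results keep the input order
  console.log('elapsed ms:', elapsed); // close to the longest wait, not the sum
  return { results, elapsed };
}

const outcome = main();
```

The total time is roughly that of the longest wait (about 300 ms here) rather than the sum of all three, because the waiting overlaps even though no two pieces of JavaScript run at the same moment.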
parallel runs all its tasks simultaneously. So if your tasks contain I/O calls (e.g. querying a DB), they'll appear as if they've been processed in parallel.
Node.js is non-blocking. So instead of handling all tasks in parallel, it switches from one task to another. So when the first task makes an I/O call and becomes idle, Node.js simply switches to processing another one.
I/O tasks spend most of their processing time waiting for the result of the I/O call. With blocking I/O, as in traditional Java code, such a task blocks its thread while it waits for the results. But Node.js uses that time to process other tasks instead of waiting.
Yes, it's almost as you said. Node.js starts processing the first task until it pauses to do an I/O call. At that moment, Node.js leaves it and grants its main thread to another task. So you may say that the thread is granted to each active task in turn.
Async.Parallel is well documented here: https://github.com/caolan/async#parallel
Async.Parallel is about kicking-off I/O tasks in parallel, not about parallel execution of code. If your tasks do not use any timers or perform any I/O, they will actually be executed in series. Any synchronous setup sections for each task will happen one after the other. JavaScript remains single-threaded.
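A small sketch of that last point, with tasks that use no timers and no I/O. The cpuTask factory and the busy loop are made up for the illustration, and a plain forEach stands in for the step that starts the tasks.

```javascript
const trace = [];

// A timer callback can only run once the current synchronous code has finished.
setTimeout(() => trace.push('timer fired'), 0);

// Hypothetical CPU-only task: no timers, no I/O, calls back synchronously.
function cpuTask(name) {
  return (callback) => {
    trace.push(name + ' start');
    let sum = 0;
    for (let i = 0; i < 1e6; i++) sum += i; // stand-in for real computation
    trace.push(name + ' end');
    callback(null, sum);
  };
}

// Starting the tasks "in parallel" is just calling them one after another.
[cpuTask('A'), cpuTask('B'), cpuTask('C')].forEach((task) => task(() => {}));
trace.push('all tasks returned');

// Print the recorded order after the zero-delay timer has had its turn.
setTimeout(() => console.log(trace.join('\n')), 10);
```

Each task starts and ends before the next one begins, and the zero-delay timer only fires after all three have returned, which is what being executed in series on a single thread means in practice.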
Correct. And "simultaneously" means "there is at least one moment in time when two or more tasks are already started, but not yet finished".
When some task stops for some reason (e.g. I/O), async.js executes another task and continues the first one later.