I came across asynchronous processing of requests by servlets while exploring how a Node.js application and a Java application handle a request.
From what I have read in different places:
The request is received and processed by an HTTP thread from the servlet container, and in case of a blocking operation (like I/O) the request can be handed over to another thread pool, so the HTTP thread that received the request can go back to receiving and processing the next request.
The time-consuming blocking operation is then taken up by a worker from that thread pool.
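The handoff I am describing can be sketched with plain `java.util.concurrent`, where two `ExecutorService` instances stand in for the container's HTTP pool and worker pool (the names `httpPool`/`workerPool` and the simulated sleep are illustrative, not Servlet API):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncHandoff {
    // stand-ins for the container's pools; names are illustrative
    static final ExecutorService httpPool = Executors.newFixedThreadPool(2);
    static final ExecutorService workerPool = Executors.newFixedThreadPool(4);

    static Future<String> handleRequest(int id) {
        // the HTTP thread does only the cheap part, then hands the
        // blocking work to the worker pool and returns immediately
        return workerPool.submit(() -> {
            Thread.sleep(100);          // simulated blocking I/O
            return "response-" + id;
        });
    }

    public static void main(String[] args) throws Exception {
        // an "HTTP thread" accepts the request and is free again right away
        Future<Future<String>> f = httpPool.submit(() -> handleRequest(1));
        System.out.println(f.get().get());   // prints response-1
        httpPool.shutdown();
        workerPool.shutdown();
    }
}
```

The point of the sketch is that `handleRequest` returns as soon as the work is queued, so the submitting thread never waits on the sleep itself.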
If what I have understood is correct, I have the following question:
Even the thread that processes the blocking operation is going to wait for that operation to complete, and is therefore still tying up a thread (and the number of threads that can run in parallel is bounded by the number of cores), if I am right.
What exactly is the gain of using asynchronous processing here?
If not, please enlighten me.
I can explain the benefits in terms of Node.js (equally applicable elsewhere).
The Problem: Blocking Network I/O.
Suppose a client opens a connection to your server. To read from that connection you need a thread T1, and its read call is blocking: the thread waits indefinitely until there is data to read. Now suppose another connection arrives around the same time; to handle it you have to create another thread, T2, which may likewise block reading data on the second connection. This means you can handle only as many connections as you can afford threads in your system. This is called the thread-per-request model. Creating lots of threads degrades system performance because of the heavy context switching and scheduling, so this model doesn't scale well.
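A minimal thread-per-connection server in Java makes the problem concrete (the echo behavior and the ephemeral port are illustrative):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerConnection {
    // binds an ephemeral port and serves each connection on its own thread
    static ServerSocket start() throws IOException {
        ServerSocket server = new ServerSocket(0);
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket conn = server.accept();          // blocks until a client connects
                    new Thread(() -> handle(conn)).start(); // one thread *per* connection
                }
            } catch (IOException closed) { /* server was shut down */ }
        });
        acceptor.setDaemon(true);
        acceptor.start();
        return server;
    }

    static void handle(Socket conn) {
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(conn.getInputStream()));
             PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
            String line = in.readLine();   // blocks this thread, possibly forever
            out.println("echo: " + line);
        } catch (IOException ignored) {
        }
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = start();
        System.out.println("listening on port " + server.getLocalPort());
    }
}
```

Every connection ties up a whole thread even while it sits idle in `readLine()`, which is exactly why ten thousand mostly-idle connections cost ten thousand parked threads.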
Solution:
A little background: FreeBSD has a system call named kqueue and Linux has epoll. Both accept a list of socket file descriptors (as function parameters); the calling thread blocks until one or more of those sockets have data ready to read, and the call returns the sublist of ready connections. Ref. http://austingwalters.com/io-multiplexing/
Assuming you now have a feel for those calls, imagine a thread, called the event loop, that calls epoll/kqueue in a loop.
In Java your code will look something like this.
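A sketch using `java.nio`'s `Selector`, which is backed by epoll/kqueue under the hood (the echo in the read branch is a stand-in for queuing the data to a worker thread, and the ephemeral port is illustrative):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class EventLoop {
    // one thread multiplexing every connection via the OS's epoll/kqueue
    static int start() throws IOException {
        Selector selector = Selector.open();       // wraps epoll (Linux) / kqueue (BSD)
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0));     // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        Thread loop = new Thread(() -> {
            try {
                while (true) {
                    selector.select();             // blocks until some channel is ready
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            SocketChannel conn = server.accept();  // ready, won't block
                            conn.configureBlocking(false);
                            conn.register(selector, SelectionKey.OP_READ);
                        } else if (key.isReadable()) {
                            SocketChannel conn = (SocketChannel) key.channel();
                            ByteBuffer buf = ByteBuffer.allocate(1024);
                            if (conn.read(buf) < 0) { conn.close(); continue; }
                            buf.flip();
                            conn.write(buf);  // echo; a real server queues this for a worker
                        }
                    }
                }
            } catch (IOException e) { /* loop shut down */ }
        });
        loop.setDaemon(true);
        loop.start();
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("event loop on port " + start());
    }
}
```

Note that a single thread services every connection here, and it only ever touches channels the OS has already reported as ready.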
So you can see that this approach can accept many more connections than the thread-per-request model, and the worker threads are never stuck, because they read only those connections that have data ready. When a worker thread has read the whole data, it queues its response or data together with the callback handler you provided at the time of listening, and that callback method is then executed by the event loop thread.
The above approach has two disadvantages: the event loop is a single thread, so it uses only one CPU core, and any long CPU-heavy processing done on it stalls every other connection.
The first disadvantage is taken care of by clustered Node.js, i.e. running one Node.js process per CPU core.
Anyway, have a look at Vert.x, which is similar to Node.js but in Java. Also explore Netty.
Yes, in this scenario the blocking operation will execute in its own thread and will tie up some resources, but your HTTP thread is now free to process other operations that may not be so time-consuming.
Your gain from asynchronous processing is the ability to continue handling other requests while waiting for the heavyweight operation's response, instead of dumbly blocking the HTTP thread.