Why do many people say I/O completion port is fast and nice model?
What is the I/O completion port's advantages and disadvantages?
I want to know some points which make faster IOCP than other model.
If you can explain it comparing other models(select, epoll, traditional multi thread/process), it would be better.
I/O completion ports are awesome. There's no better word to describe them. If anything in Windows was done right, it's completion ports.
You can create some number of threads (does not really matter how many) and make them all block on one completion port until an event (either one you post manually, or an event from a timer or asynchronous I/O, or whatever) arrives. Then the completion port will wake one thread to handle the event, up to the limit that you specified. If you didn't specify anything, it will assume "up to number of CPU cores", which is really nice.
If there are already more threads active than the maximum limit, it will wait until one of them is done and then hand the event to the thread as soon as it goes to wait state. Also, it will always wake threads in a LIFO order, so chances are that caches are still warm.
In other words, completion ports are a no-fuss "poll for events" as well as "fill CPU as much as you can" solution.
You can throw file reads and writes at a completion port, sockets, or anything else that's waitable. And, you can post your own events if you want. Each custom event has at least one integer and one pointer worth of data (if you use the default structure), but you are not really limited to that as the system will happily accept any other structure too.
Also, completion ports are fast, really really fast. Once upon a time, I needed to notify one thread from another. As it happened, that thread already had a completion port for file I/O, but it didn't pump messages. So, I wondered if I should just bite the bullet and use the completion port for simplicity, even though posting a thread message would obviously be much more efficient. I was undecided, so I benchmarked. Surprise, it turned out completion ports were about 3 times faster. So... faster and more flexible, the decision was not hard.
by using IOCP, we can overcome the "one-thread-per-client" problem. It is commonly known that the performance decreases heavily if the software does not run on a true multiprocessor machine. Threads are system resources that are neither unlimited nor cheap.
IOCP provides a way to have a few (I/O worker) threads handle multiple clients' input/output "fairly". The threads are suspended, and don't use the CPU cycles until there is something to do.
Also you can read some information in this nice book http://www.amazon.com/Windows-System-Programming-Johnson-Hart/dp/0321256190
I/O completion ports are provided by the O/S as an asynchronous I/O operation, which means that it occurs in the background (usually in hardware). The system does not waste any resources (e.g. threads) waiting for the I/O to complete. When the I/O is complete, the hardware sends an interrupt to the O/S, which then wakes up the relevant process/thread to handle the result. WRONG: IOCP does NOT require hardware support (see comments below)
Typically a single thread can wait on a large number of I/O completions while taking up very little resources when the I/O has not returned.
Other async models that are not based on I/O completion ports usually employ a thread pool and have threads wait for I/O to complete, thereby using more system resources.
The flip side is that I/O completion ports usually require hardware support, and so they are not generally applicable to all async scenarios.