I want to read and parse a lot of files. Since there are more than 10,000 files to parse, I want to speed the process up by using threads.
For example, if I had 5 threads, I want to have them all read a certain number of files concurrently so that the process of reading and parsing is faster. Is this possible? Would I gain any significant speed-up by splitting this up into threads? If so, how can I do this?
P.S. I am not against using external libraries.
I am working with JDK 1.6.
See "How to read all lines of a file in parallel in Java 8" for reading a single file in parallel.
In your case, I'd just launch a pool of threads with as many threads as your process will allow, each with a "read the whole file" request for a file assigned to it, and let the OS decide which files to read in which order.
If you have many files to read, the better approach is to have no more than one thread read each file. The best way to handle many tasks with multiple threads, in most cases, is to use an ExecutorService backed by a thread pool. Submit one task to the service for each file to be read. Make the thread pool large enough to keep the I/O system busy (which is likely to be the bottleneck) and you will get the most out of the available throughput.
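Here is a minimal sketch of that approach, written to compile on JDK 1.6 (no lambdas, no try-with-resources). The class name ParallelFileParser and the methods parseFile and parseAll are hypothetical names chosen for illustration, and parseFile only counts lines as a stand-in for your real parsing logic. The pool size of 5 is taken from the question and is an assumption, not a recommendation; you would need to measure to find the size that keeps your disk busy.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFileParser {

    // Hypothetical placeholder for the real parser: counts the lines in one file.
    static int parseFile(File file) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } finally {
            reader.close();
        }
    }

    // Submits one task per file to a fixed-size pool and returns the
    // results in the same order as the input list.
    static List<Integer> parseAll(List<File> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
            for (final File file : files) {
                futures.add(pool.submit(new Callable<Integer>() {
                    public Integer call() throws IOException {
                        return parseFile(file);
                    }
                }));
            }
            List<Integer> results = new ArrayList<Integer>();
            for (Future<Integer> future : futures) {
                // Blocks until this file's task has finished; a failure in
                // the task surfaces here as an ExecutionException.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown(); // stop accepting tasks and let the workers exit
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ParallelFileParser <directory>");
            return;
        }
        File[] listed = new File(args[0]).listFiles();
        List<File> files = new ArrayList<File>();
        if (listed != null) {
            for (File f : listed) {
                if (f.isFile()) {
                    files.add(f);
                }
            }
        }
        List<Integer> counts = parseAll(files, 5); // 5 threads, as in the question
        for (int i = 0; i < files.size(); i++) {
            System.out.println(files.get(i).getName() + ": " + counts.get(i) + " lines");
        }
    }
}
```

Each file is handled by exactly one task, so no two threads ever read the same file. Whether this is actually faster than a single thread depends on your storage: try a few pool sizes against a representative sample of files before settling on one.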