How to make a thread limit in Java

2020-02-09 09:06发布

问题:

Let's say I have 1000 files to read and because of some limits, I want to read maximum 5 files in parallel. And, as soon as one of them is finished, I want a new one starts.

I have a main function who have the list of the files and I try changing a counter whenever one thread is finished. but it doesn't works!

Any suggestion?

The following is the main function loop

for (final File filename : folder.listFiles()) {

    Object lock1 = new Object();
    new myThread(filename, lock1).start();
    counter++;
    while (counter > 5);
}

回答1:

Spawning threads like this is not the way to go. Use an ExecutorService and specify the pool to be 5. Put all the files in something like a BlockingQueue or another thread-safe collection and all the executing ones can just poll() it at will.

public class ThreadReader {

    public static void main(String[] args) {
        File f = null;//folder
        final BlockingQueue<File> queue = new ArrayBlockingQueue<File>(1000);
        for(File kid : f.listFiles()){
            queue.add(kid);
        }

        ExecutorService pool = Executors.newFixedThreadPool(5);

        for(int i = 1; i <= 5; i++){
            Runnable r = new Runnable(){
                public void run() {
                    File workFile = null;
                    while((workFile = queue.poll()) != null){
                        //work on the file.
                    }
                }
            };
            pool.execute(r);
        }
    }
}


回答2:

The approach in Kylar's answer is the correct one. Use the executor classes provided by the Java class libraries rather than implementing thread pooling yourself from scratch (badly).


But I thought it might be useful to discuss the code in your question and why it doesn't work. (I've filled in some of the parts that you left out as best I can ...)

public class MyThread extends Thread {

    private static int counter;

    public MyThread(String fileName, Object lock) {
        // Save parameters in instance variables
    }

    public void run() {
        // Do stuff with instance variables
        counter--;
    }

    public static void main(String[] args) {
        // ...
        for (final File filename : folder.listFiles()) {
            Object lock1 = new Object();
            new MyThread(filename, lock1).start();
            counter++;
            while (counter > 5);
        }
        // ...
    }
}

OK, so what is wrong with this? Why doesn't it work?

Well the first problem is that in main you are reading and writing counter without doing any synchronization. I assume that it is also being updated by the worker threads - the code makes no sense otherwise. So that means that there is a good chance that the main threads won't see the result of the updates made by the child threads. In other words, while (counter > 5); could be an infinite loop. (In fact, this is pretty likely. The JIT compiler is allowed to generate code in which the counter > 5 simply tests the value of counter left in a register after the previous counter++; statement.

The second problem is that your while (counter > 5); loop is incredibly wasteful of resources. You are telling the JVM to poll a variable ... and it will do this potentially BILLIONS of times a second ... running one processor (core) flat out. You shouldn't do that. If you are going to implement this kind of stuff using low-level primitives, you should use Java's Object.wait() and Object.notify() methods; e.g. the main thread waits, and each worker thread notifies.



回答3:

You can use an ExecutorService as a thread pool AND a queue.

ExecutorService pool = Executors.newFixedThreadPool(5);
File f = new File(args[0]);

for (final File kid : f.listFiles()) {
    pool.execute(new Runnable() {
        @Override
        public void run() {
            process(kid);
        }
    });
}
pool.shutdown();
// wait for them to finish for up to one minute.
pool.awaitTermination(1, TimeUnit.MINUTES);


回答4:

Whatever method you are using to create a new Thread, increment a global counter, add a conditional statement around the thread creation that if the limit has been reached then don't create a new thread, maybe push the files onto a queue (a list?) and then you could add another conditional statement, after a thread is created, if there are items in the queue, to process those items first.