Java 8: How can I convert a for loop to run in par

Posted 2019-05-21 19:35

Question:

for (int i = 0; i < 100000; i++) {
    // REST API request.
    restTemplate.exchange(url, HttpMethod.GET, request, String.class);
}

I have to request a resource for 100k users, and it takes 70 minutes to finish. I cleaned up my code as much as possible, but that only reduced the time by 4 minutes.

Since the requests are independent of each other, I would like to send them in parallel (maybe in chunks of 10, 100, or even 1000, each of which finishes quickly). I'm hoping to reduce the time to 10 minutes or something close. How do I calculate which chunk size would get the job done fastest?

I found the following approach, but I can't tell whether the program processes all 20 at a time, or 5 at a time, or 10 at a time.

IntStream.range(0,20).parallel().forEach(i->{
     ... do something here
});

I appreciate your help. I am open to any suggestions or criticism!

UPDATE: I was able to use IntStream and the task finished in 28 minutes. But I am not sure this is the best I can do.

Answer 1:

I used the following code in Java 8 and it did the job. I was able to reduce the batch job's run time from 28 minutes to 3:39 minutes.

IntStream.range(0, 100000).parallel().forEach(i -> {
    restTemplate.exchange(url, HttpMethod.GET, request, String.class);
});


Answer 2:

A plain call to parallel() runs on the common ForkJoinPool, whose default parallelism is the number of cores available on your machine minus one (the calling thread also takes part in the work).
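To see what this means on a particular machine, the default can be printed directly. The following is a minimal sketch; the class name CommonPoolInfo and the helper commonParallelism are made up for illustration, while the two JDK calls are standard.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper class, for illustration only
class CommonPoolInfo {

    // Target parallelism of the common pool used by parallel streams
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Available processors: " + cores);
        System.out.println("Common pool parallelism: " + commonParallelism());
    }
}
```

Since a REST call spends most of its time waiting on the network rather than using the CPU, a core-based default is likely to be far lower than what the job could use.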

If you want to set the parallelism yourself, you have two options:

  1. Change the parallelism of the common pool before it is first used: System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20")
  2. Use your own pool:

Example:

int allRequestsCount = 20;
int parallelism = 4; // Vary on your own

ForkJoinPool forkJoinPool = new ForkJoinPool(parallelism);
IntStream.range(0, parallelism).forEach(i -> forkJoinPool.submit(() -> {
  int chunkSize = allRequestsCount / parallelism;
  IntStream.range(i * chunkSize, i * chunkSize + chunkSize)
           .forEach(num -> {

             // Simulate long running operation
             try {
               Thread.sleep(1000);
             } catch (InterruptedException e) {
               e.printStackTrace();
             }

             System.out.println(Thread.currentThread().getName() + ": " + num);
           });
}));

This implementation is only an example to give you an idea.
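A commonly used variation is to keep the parallel stream but start it from inside your own pool, so that its work runs on that pool's threads instead of the common pool. The sketch below assumes this behavior, which is widely relied upon but is an implementation detail rather than something the Stream API documents. The class name CustomPoolStream, the method runInPool, and the counter standing in for the REST call are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical class, for illustration only
class CustomPoolStream {

    // Runs `count` placeholder tasks as a parallel stream started inside a dedicated pool
    static int runInPool(int count, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        AtomicInteger completed = new AtomicInteger();
        try {
            pool.submit(() ->
                IntStream.range(0, count).parallel().forEach(i -> {
                    // Placeholder for restTemplate.exchange(...)
                    completed.incrementAndGet();
                })
            ).join(); // Block until the whole stream has been processed
        } finally {
            pool.shutdown(); // Release the pool's threads
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runInPool(20, 4));
    }
}
```

With this shape, the parallelism argument is the only number to tune, and the stream handles splitting the range into chunks, including any remainder.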



Answer 3:

For your situation you can use the fork/join framework or create an ExecutorService backed by a pool of threads.

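    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    ExecutorService service = Executors.newFixedThreadPool(8);
    try {
        service.submit(() -> {
            // do your task
        });
    } finally {
        // Stop accepting new tasks; tasks already submitted still run
        service.shutdown();
    }
    // Wait up to one minute for the submitted tasks to finish
    // (awaitTermination throws InterruptedException)
    if (service.awaitTermination(1, TimeUnit.MINUTES)) {
        System.out.println("All threads have been finished");
    } else {
        System.out.println("At least one thread running");
    }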
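For many independent requests, the same idea can be extended by submitting one task per request and waiting for all of them. The following is a rough sketch under stated assumptions: the class name FixedPoolRequests, the method runAll, and the counter standing in for the REST call are hypothetical, and the thread count of 8 is only carried over from the snippet above, not a recommendation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class, for illustration only
class FixedPoolRequests {

    // Submits one task per request to a fixed-size pool and waits for all of them
    static int runAll(int requestCount, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                futures.add(CompletableFuture.runAsync(() -> {
                    // Placeholder for restTemplate.exchange(...)
                    completed.incrementAndGet();
                }, executor));
            }
            // Block until every submitted task has finished
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            executor.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed: " + runAll(1000, 8));
    }
}
```

Because the tasks are queued individually, no manual chunking is needed; the pool size alone bounds how many requests are in flight at once.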

And using the fork/join framework:

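    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.ForkJoinTask;
    import java.util.concurrent.RecursiveAction;

    class RequestHandler extends RecursiveAction {

        private final int start;
        private final int end;

        public RequestHandler(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        protected void compute() {
            if (end - start <= 10) {
                // Small enough: handle the range [start, end) directly
                for (int i = start; i < end; i++) {
                    // REST request
                }
            } else {
                // Split the range in half and process both halves as subtasks
                int middle = start + (end - start) / 2;
                invokeAll(new RequestHandler(start, middle), new RequestHandler(middle, end));
            }
        }
    }

    public class MainClass {
        public static void main(String[] args) {
            ForkJoinTask<?> task = new RequestHandler(0, 100000);
            ForkJoinPool pool = new ForkJoinPool();
            pool.invoke(task);
        }
    }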


Answer 4:

I've written a short article about this. It contains a simple tool that lets you control the pool size:

https://gt-dev.blogspot.com/2016/07/java-8-threads-parallel-stream-how-to.html