future.isDone returns false even if the task is done

Posted 2020-06-19 06:36

Question:

I have a tricky situation: can future.isDone() return false even though the task has finished?

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DataAccessor {
    private static ThreadPoolExecutor executor;
    private int timeout = 100000;
    static {
        executor = new ThreadPoolExecutor(10, 10, 1000, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1000));
    }

    public static void main(String[] args) {
        List<String> requests = new ArrayList<String>();
        for(int i=0; i<20; i++){
            requests.add("request:"+i);
        }
        DataAccessor dataAccessor = new DataAccessor();

        List<ProcessedResponse> results = dataAccessor.getDataFromService(requests);
        for(ProcessedResponse response:results){
            System.out.println("response"+response.toString()+"\n");
        }
        executor.shutdown();
    }

    public List<ProcessedResponse> getDataFromService(List<String> requests) {
        final CountDownLatch latch = new CountDownLatch(requests.size());
        List<SubmittedJob> submittedJobs = new ArrayList<SubmittedJob>(requests.size());
        for (String request : requests) {
            Future<ProcessedResponse> future = executor.submit(new GetAndProcessResponse(request, latch));
            submittedJobs.add(new SubmittedJob(future, request));
        }
        try {
            if (!latch.await(timeout, TimeUnit.MILLISECONDS)) {
                // some of the jobs not done
                System.out.println("some jobs not done");
            }
        } catch (InterruptedException e1) {
            // take care, or cleanup
            for (SubmittedJob job : submittedJobs) {
                job.getFuture().cancel(true);
            }
        }
        List<ProcessedResponse> results = new LinkedList<DataAccessor.ProcessedResponse>();
        for (SubmittedJob job : submittedJobs) {
            try {
                // before doing a get you may check if it is done
                if (!job.getFuture().isDone()) {
                    // cancel job and continue with others
                    job.getFuture().cancel(true);
                    continue;
                }
                ProcessedResponse response = job.getFuture().get();
                results.add(response);
            } catch (ExecutionException cause) {
                // exceptions that occurred during execution, if any
            } catch (InterruptedException e) {
                // take care
            }
        }
        return results;
    }

    private class SubmittedJob {
        final String request;
        final Future<ProcessedResponse> future;

        public Future<ProcessedResponse> getFuture() {
            return future;
        }

        public String getRequest() {
            return request;
        }

        SubmittedJob(final Future<ProcessedResponse> job, final String request) {
            this.future = job;
            this.request = request;
        }
    }

    private class ProcessedResponse {
        private final String request;
        private final String response;

        ProcessedResponse(final String request, final String response) {
            this.request = request;
            this.response = response;
        }

        public String getRequest() {
            return request;
        }

        public String getResponse() {
            return response;
        }

        public String toString(){
            return "[request:"+request+","+"response:"+ response+"]";
        }
    }

    private class GetAndProcessResponse implements Callable<ProcessedResponse> {
        private final String request;
        private final CountDownLatch countDownLatch;

        GetAndProcessResponse(final String request, final CountDownLatch countDownLatch) {
            this.request = request;
            this.countDownLatch = countDownLatch;
        }

        public ProcessedResponse call() {
            try {
                return getAndProcessResponse(this.request);
            } finally {
                countDownLatch.countDown();
            }
        }

        private ProcessedResponse getAndProcessResponse(final String request) {
            // do the service call
            // ........
            if("request:16".equals(request)){
                throw (new RuntimeException("runtime"));
            }
            return (new ProcessedResponse(request, "response.of." + request));
        }
    }
}

If I call future.isDone(), it returns false even though countDownLatch.await() has returned true. Any idea why? Note also that countDownLatch.await() returns immediately when this happens.

If the formatting above is hard to read, view the code here: http://tinyurl.com/7j6cvep .

Answer 1:

The issue is most likely one of timing. The latch is released before all of the tasks are actually complete as far as the Future is concerned, because the countDown() invocation is inside the call() method and the Future is only marked done after call() has returned.

You are basically recreating the work of a CompletionService (the implementation is ExecutorCompletionService); I would recommend using that instead. You can use the poll(timeout, unit) method to get the results. Just keep track of the total elapsed time and reduce the timeout on each call to the time remaining.
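A minimal sketch of that suggestion, assuming a String result in place of ProcessedResponse. The class name CompletionServiceSketch and the helpers process and collect are hypothetical and stand in for the question's code; they are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class CompletionServiceSketch {

    // Hypothetical stand-in for the service call in the question.
    static String process(String request) {
        if ("request:16".equals(request)) {
            throw new RuntimeException("runtime");
        }
        return "response.of." + request;
    }

    static List<String> collect(ExecutorService executor, List<String> requests, long timeoutMillis) {
        CompletionService<String> completionService = new ExecutorCompletionService<>(executor);
        List<Future<String>> futures = new ArrayList<>();
        for (String request : requests) {
            futures.add(completionService.submit(() -> process(request)));
        }

        // One overall deadline; each poll only waits for the time that is left.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        List<String> results = new ArrayList<>();
        for (int i = 0; i < requests.size(); i++) {
            try {
                long remaining = deadline - System.nanoTime();
                // poll only hands back futures that are already complete
                Future<String> done = completionService.poll(remaining, TimeUnit.NANOSECONDS);
                if (done == null) {
                    break; // overall timeout reached
                }
                results.add(done.get());
            } catch (ExecutionException e) {
                // the task threw; skip its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        // Cancel whatever is still pending; this has no effect on completed futures.
        for (Future<String> future : futures) {
            future.cancel(true);
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<String> requests = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            requests.add("request:" + i);
        }
        List<String> results = collect(executor, requests, 10_000);
        executor.shutdown();
        System.out.println(results.size() + " responses collected");
    }
}
```

Results arrive in completion order rather than submission order, so the request would need to be carried inside the result if the pairing matters.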



Answer 2:

As jtahlborn mentioned, this is probably a race condition: the CountDownLatch signals its waiting threads, and a waiting thread evaluates the Future's cancel condition before the FutureTask finishes its execution (which happens at some point after the countDown).

You simply cannot rely on the synchronization mechanisms of the CountDownLatch being in sync with those of a Future. What you should do is rely on the Future to tell you when it is done.

You can call Future.get(long timeout, TimeUnit unit) instead of CountDownLatch.await(long timeout, TimeUnit unit). To get the same effect as the latch, add all the Futures to a List, iterate over the list, and call get on each Future.
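A rough sketch of that approach, again with a String result in place of ProcessedResponse. FutureGetSketch, process and collect are hypothetical names, and sharing a single deadline across all the get calls is an assumption about how the overall timeout should behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureGetSketch {

    // Hypothetical stand-in for the service call in the question.
    static String process(String request) {
        if ("request:16".equals(request)) {
            throw new RuntimeException("runtime");
        }
        return "response.of." + request;
    }

    static List<String> collect(ExecutorService executor, List<String> requests, long timeoutMillis) {
        List<Future<String>> futures = new ArrayList<>();
        for (String request : requests) {
            futures.add(executor.submit(() -> process(request)));
        }

        // No latch: each Future reports its own completion through get.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            try {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                results.add(future.get(remaining, TimeUnit.NANOSECONDS));
            } catch (TimeoutException e) {
                future.cancel(true); // not finished within the overall timeout
            } catch (ExecutionException e) {
                // the task threw; skip its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                future.cancel(true);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<String> requests = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            requests.add("request:" + i);
        }
        List<String> results = collect(executor, requests, 10_000);
        executor.shutdown();
        System.out.println(results.size() + " responses collected");
    }
}
```

Because the futures are visited in submission order, the results keep the order of the requests, minus any that failed or timed out.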



Answer 3:

Here is the scenario of the race condition:

  • The main thread is blocked in latch.await and gets no CPU time from the scheduler for a few milliseconds.
  • The last executor thread calls countDownLatch.countDown() in the finally clause.
  • The scheduler decides to give priority to the main thread because it has waited for a while.
  • As a result, when the main thread asks for the last Future's result, it is not available yet: the last executor thread has had no time slice to publish the result and is still in the finally clause.

I have not found a detailed explanation of how the Java scheduler really works, probably because it depends mainly on the operating system running the JVM. Generally speaking, it tries to share the CPU equally among runnable threads, on average, over a period of time. That is why the main thread can reach the isDone test before the other thread has left the finally clause.
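The window described in this scenario can be made deterministic in a small sketch. It assumes a FutureTask subclass whose protected set method is overridden to hold the result back artificially; that delay is not in the original code and only stretches the gap that normally lasts a few instructions. LatchRaceSketch and isDoneRightAfterLatch are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class LatchRaceSketch {

    /** Returns what isDone() reported immediately after the latch was released. */
    static boolean isDoneRightAfterLatch() {
        final CountDownLatch latch = new CountDownLatch(1);
        final CountDownLatch checked = new CountDownLatch(1);

        FutureTask<String> task = new FutureTask<String>(() -> {
            try {
                return "response";
            } finally {
                latch.countDown(); // same placement as in the question
            }
        }) {
            @Override
            protected void set(String value) {
                // Artificial delay (an assumption for the demo): do not publish
                // the result until the main thread has looked at isDone().
                try {
                    checked.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                super.set(value);
            }
        };

        new Thread(task).start();
        try {
            latch.await();                // call() has finished its finally block
            boolean done = task.isDone(); // the result has not been published yet
            checked.countDown();          // let the worker publish it
            task.get();                   // now the task really is complete
            return done;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("isDone right after latch: " + isDoneRightAfterLatch());
    }
}
```

In this sketch the reported value should be false, because the task only leaves its initial state inside set, which runs after call() has returned.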

I suggest you change how you collect the results after latch.await. Since you know the latch has been counted down to zero (unless the main thread was interrupted), all results should be available very soon. Calling get with a timeout gives the scheduler a chance to assign a time slice to the last thread still sitting in the finally clause:

    for (SubmittedJob job : submittedJobs) {
        try {
            ProcessedResponse response = null;
            try {
                // Try to get answer in short timeout, should be available
                response = job.getFuture().get(10, TimeUnit.MILLISECONDS);
            } catch (TimeoutException te) {
                job.getFuture().cancel(true);
                continue;
            }
            results.add(response);
        } catch (ExecutionException cause) {
            // exceptions that occurred during execution, if any
        } catch (InterruptedException e) {
            // take care
        }
    }

A remark: your code is not realistic, as the getAndProcessResponse method finishes in less than a millisecond. With a random sleep there, the race condition does not show up as often.



Answer 4:

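I second the opinions about race conditions. I'd suggest forgetting about the latch and using java.util.concurrent.ThreadPoolExecutor.awaitTermination(long, TimeUnit) instead. Note that it only returns true once shutdown() has been called on the executor and all tasks have completed.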
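A sketch of this variant, assuming the executor can be created per batch, since awaitTermination needs shutdown() to be called first and a shut-down pool cannot be reused. That differs from the shared static executor in the question. AwaitTerminationSketch, process and collect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class AwaitTerminationSketch {

    // Hypothetical stand-in for the service call in the question.
    static String process(String request) {
        if ("request:16".equals(request)) {
            throw new RuntimeException("runtime");
        }
        return "response.of." + request;
    }

    static List<String> collect(List<String> requests, long timeoutMillis) {
        // Assumption: one pool per batch, because it is shut down below.
        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<Future<String>> futures = new ArrayList<>();
        for (String request : requests) {
            futures.add(executor.submit(() -> process(request)));
        }

        executor.shutdown(); // accept no new tasks; already submitted ones still run
        boolean finished = false;
        try {
            finished = executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!finished) {
            executor.shutdownNow(); // interrupt running tasks, drop queued ones
        }

        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            if (!future.isDone()) {
                future.cancel(true); // timed out or never started
                continue;
            }
            try {
                results.add(future.get());
            } catch (ExecutionException e) {
                // the task threw; skip its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> requests = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            requests.add("request:" + i);
        }
        System.out.println(collect(requests, 10_000).size() + " responses collected");
    }
}
```

The trade-off is the cost of creating a pool per call; the Future-based approaches from the other answers keep the shared executor.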