Java 8 parallel streams don't appear to actual

I'm trying to use Java 8's parallelStream() to execute several long-running requests (eg web requests) in parallel. Simplified example:

List<Supplier<Result>> myFunctions = Arrays.asList(() -> doWebRequest(), ...)

List<Result> results = myFunctions.parallelStream().map(function -> function.get()).collect(...

So if there are two functions that block for 2 and 3 seconds respectively, I'd expect to get the result after 3 seconds. However, it really takes 5 seconds - ie it seems the functions are being executed in sequence and not in parallel. Am I doing something wrong?

edit: This is an example. The time taken is ~4000 milliseconds when I want it to be ~2000.

    long start = System.currentTimeMillis();

    Map<String, Supplier<String>> input = new HashMap<String, Supplier<String>>();

    input.put("1", () -> {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "a";
    });

    input.put("2", () -> {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "b";
    });

    Map<String, String> results = input.keySet().parallelStream().collect(Collectors.toConcurrentMap(
            key -> key,
            key -> {
                return input.get(key).get();
            }));

    System.out.println("Time: " + (System.currentTimeMillis() - start));

}

Doesn't make any difference if I iterate over the entrySet() instead of the keySet()

edit: changing the parallel part to the following also does not help:

 Map<String, String> results = input.entrySet().parallelStream().map(entry -> {
            return new ImmutablePair<String, String>(entry.getKey(), entry.getValue().get());
    }).collect(Collectors.toConcurrentMap(Pair::getLeft, Pair::getRight));

标签： java concurrency parallel-processing java-8

1条回答

冷血范

2楼-- · 2019-09-16 06:05

When executing in parallel, there is overhead of decomposing the input set, creating tasks to represent the different portions of the calculation, distributing the actions across threads, waiting for results, combining results, etc. This is over and above the work of actually solving the problem. If a parallel framework were to always decompose problems down to a granularity of one element, for most problems, these overheads would overwhelm the actual computation and parallelism would result in a slower execution. So parallel frameworks have some latitude to decide how finely to decompose the input, and that's what's happening here.

In your case, your input set is simply too small to be decomposed. So the library chooses to execute sequentially.

Try this on your four-core system: compare

IntStream.range(0, 100_000).sum()

IntStream.range(0, 100_000).parallel().sum()

Here, you're giving it enough input that it will be confident it can win through parallel execution. If you measure with a responsible measurement methodology (say, the JMH microbenchmark harness), you'll probably see an almost-linear speedup between these two examples.

0人赞添加讨论(0) 举报

Java 8 parallel streams don't appear to actual

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间