Why does this parallel algorithm run more slowly t

Sequential:

void do(List<D> d, final List<C> c) {
for (D datum : d)
    getChampoid(datum, c).tally(datum);

Parallel:

static final int procs = Runtime.getRuntime().availableProcessors();
static final ExecutorService pool = Executors.newFixedThreadPool(procs);
void do(List<D> d, final List<C> c) {
    List<Future> futures = new ArrayList<>();
    for (final D datum : d)
        futures.add(pool.submit(new Runnable() {

            @Override
            public void run() {
                getChampoid(datum, c).tally(datum);
            }

        }));
    for (Future f : futures)
        try {
            f.get();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }

I'm stumped because to me they look like they do the exact same thing, the parallel version should just be faster, but it's an order of magnitude slower. Any thoughts?

FYI both d and c are huge lists with somewhere between thousands and hundreds of thousands of items.

标签： java concurrency parallel-processing

2条回答

The star\"

2楼-- · 2020-05-07 11:00

A) have you ruled out any synchronization? If getChampoid hits on a synchronized data structure (such as a Hashtable), i'm not surprised performance drops drastically.

B) with Java, Object overhead is usually what kills performance. Make sure to use profilers such as VisualVM. In your case, I would not be surprised if you see a lot of time going into garbage collection.

Consider breaking your list d into a small number of parts, then processing each part in a thread. Right now, for every datum you will create a Runnable and a Future object, and probably some more. Which, when d is large, means a lot of work for the garbage collector (and here, it cannot be optimized away by Hotspot either).

0人赞添加讨论(0) 举报

劳资没心，怎么记你

3楼-- · 2020-05-07 11:13

the parallel version should just be faster,

That is a common misconception.

but it's an order of magnitude slower.

A common result

Any thoughts?

Using multiple threads has overhead which one thread doesn't have. The overhead can be orders of magnitude higher than the work actually done, making it much slower. If the work done is much larger than the overhead you can get very good scalability.

e.g. say it costs about 10 micro-seconds to pass a task to another thread. If your task takes 1 micro-second, the overhead can kill your performance. However, if the task takes 100 micro-seconds, you could see a significant performance improvement and worth paying the price of the overhead.

In short, nothing is free esp not multiple thread.

0人赞添加讨论(0) 举报

Why does this parallel algorithm run more slowly t

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间