How to get ideal number of threads in parallel pro

Posted 2020-02-12 03:51

I need to determine the ideal number of threads for a batch program that runs in a batch framework supporting a parallel mode, such as parallel steps in Spring Batch.

As far as I know, using too many threads to execute the steps of a program is not good and may hurt its performance. Several factors can cause the degradation: context switching and race conditions on shared resources (locking, synchronization, ...). Are there any other factors?

Of course, the best way to find the ideal number of threads is to run actual tests while adjusting the thread count. But in my situation that is not easy, because the tests need many things (people, test scheduling, test data, etc.) that are too difficult for me to arrange right now. So, before running actual tests, I want to know how to make a reasonable estimate of the ideal number of threads for my program. What should I consider to get the ideal number of threads (steps) for my program? The number of CPU cores? The number of processes on the machine my program will run on? The number of database connections? Is there a rational approach, such as a formula, for a situation like this?

3 answers
The star\"
#2 · 2020-02-12 04:28

What should I consider to get the ideal number of threads (steps) for my program? The number of CPU cores? The number of processes on the machine my program will run on? The number of database connections? Is there a rational approach, such as a formula, for a situation like this?

This is tremendously difficult to do without a lot of knowledge of the actual code that you are threading. As @Erwin mentions, whether the operations are IO-bound or CPU-bound is the key piece of knowledge you need before you can determine whether threading the application will result in any improvement at all. Even if you did manage to find the sweet spot for your particular hardware, you might boot on another server (or a different instance of a virtual cloud node) and see radically different performance numbers.

One thing to consider is to change the number of threads at runtime. The ThreadPoolExecutor.setCorePoolSize(...) is designed to be called after the thread-pool is in operation. You could expose some JMX hooks to do this for you manually.
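As a rough sketch of the runtime resizing described above: the helper below changes the size of a running ThreadPoolExecutor. The class and method names (ResizablePool, resize) are made up for illustration, and it assumes a fixed-size pool where core and maximum sizes are kept equal. The order of the two setter calls matters, because the executor rejects a maximum that is smaller than the core size.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ResizablePool {
    // Hypothetical helper: resize a pool whose core and max sizes are kept equal.
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            // Growing: raise the maximum first so core never exceeds max.
            pool.setMaximumPoolSize(newSize);
            pool.setCorePoolSize(newSize);
        } else {
            // Shrinking (or unchanged): lower core first, then max.
            pool.setCorePoolSize(newSize);
            pool.setMaximumPoolSize(newSize);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        resize(pool, 8);
        System.out.println("core size after growing: " + pool.getCorePoolSize());
        resize(pool, 2);
        System.out.println("core size after shrinking: " + pool.getCorePoolSize());
        pool.shutdown();
    }
}
```

A method like resize could be what you expose through a JMX MBean operation, so the value can be changed from a console while the batch is running.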

You could also allow your application to monitor the application or system CPU usage at runtime and tweak the values based on that feedback. You could also keep AtomicLong throughput counters and dial the threads up and down at runtime trying to maximize the throughput. Getting that right might be tricky however.
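One possible shape for the throughput feedback idea, as a minimal sketch only: workers bump an AtomicLong, a monitor samples it periodically, and a simple hill-climbing rule decides whether to add or remove a thread. All names here (ThroughputTuner, nextStep, clamp) are hypothetical, and the rule is deliberately naive; real workloads are noisy and would need smoothing.

```java
import java.util.concurrent.atomic.AtomicLong;

class ThroughputTuner {
    private final AtomicLong completed = new AtomicLong();

    // Worker threads call this once per finished item.
    void recordCompletion() {
        completed.incrementAndGet();
    }

    // A monitor calls this periodically; returns items finished since the last call.
    long sampleAndReset() {
        return completed.getAndSet(0);
    }

    // Naive hill climbing (an assumption, not a tuned policy): keep moving in the
    // same direction while throughput improves, otherwise reverse direction.
    static int nextStep(long previousThroughput, long currentThroughput, int lastStep) {
        return currentThroughput > previousThroughput ? lastStep : -lastStep;
    }

    // Keep the thread count inside configured bounds.
    static int clamp(int threads, int min, int max) {
        return Math.max(min, Math.min(max, threads));
    }

    public static void main(String[] args) {
        ThroughputTuner meter = new ThroughputTuner();
        for (int i = 0; i < 5; i++) {
            meter.recordCompletion();
        }
        long current = meter.sampleAndReset();
        int step = nextStep(3, current, 1); // previous sample was 3 items
        System.out.println("next thread count: " + clamp(4 + step, 1, 16));
    }
}
```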

I typically try to:

  • make a best guess at a thread number
  • instrument the application so I can determine the effects of different numbers of threads
  • allow it to be tweaked at runtime via JMX so I can see the effects
  • make sure the number of threads is configurable (via a system property, maybe) so I don't have to rerelease to try different thread numbers
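The last point can be as small as the sketch below. The property name batch.threads is an assumption chosen for illustration; the fallback to the number of available processors is just a starting guess, not a recommendation.

```java
class ThreadConfig {
    // Hypothetical property name; pass it as -Dbatch.threads=12 on the command line.
    static final String PROPERTY = "batch.threads";

    static int configuredThreads() {
        int fallback = Runtime.getRuntime().availableProcessors();
        // Integer.getInteger returns the fallback when the property is missing or not a number.
        int requested = Integer.getInteger(PROPERTY, fallback);
        return Math.max(1, requested); // never go below one thread
    }

    public static void main(String[] args) {
        System.out.println("threads = " + configuredThreads());
    }
}
```

Running with java -Dbatch.threads=12 ThreadConfig.java lets you try a different thread count without rebuilding.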
唯我独甜
#3 · 2020-02-12 04:31

The most important consideration is whether your application/calculation is CPU-bound or IO-bound.

  • If it's IO-bound (a single thread spends most of its time waiting for external resources such as database connections, file systems, or other external sources of data), then you can assign (many) more threads than the number of available processors. How many also depends on how well the external resource scales, though; local file systems probably do not scale that much.
  • If it's (mostly) CPU-bound, then slightly over the number of available processors is probably best.
够拽才男人
#4 · 2020-02-12 04:35

General Equation:

Number of Threads <= (Number of cores) / (1 - blocking factor)

Where 0 <= blocking factor < 1
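To make the equation concrete: with 8 cores and a blocking factor of 0.5 (threads wait half the time) the bound is 8 / (1 - 0.5) = 16 threads, and with a blocking factor of 0.9 it is 8 / (1 - 0.9) = 80. A small sketch of the same arithmetic follows; the class and method names are made up, and the blocking factor used in main is an assumed value, since in practice you have to estimate or measure it.

```java
class ThreadEstimate {
    // Upper bound from the equation above: cores / (1 - blockingFactor), rounded down.
    static int maxThreads(int cores, double blockingFactor) {
        if (cores < 1 || blockingFactor < 0.0 || blockingFactor >= 1.0) {
            throw new IllegalArgumentException(
                    "need cores >= 1 and 0 <= blockingFactor < 1");
        }
        return (int) Math.floor(cores / (1.0 - blockingFactor));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // 0.5 is an assumed blocking factor, used for illustration only.
        System.out.println("upper bound: " + maxThreads(cores, 0.5));
    }
}
```

A blocking factor of 0 (purely CPU-bound work) gives exactly the number of cores, which matches the usual advice for CPU-bound tasks.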

Number of cores of the machine: Runtime.getRuntime().availableProcessors()

To see how many threads can run in parallel, print the common pool:

System.out.println(ForkJoinPool.commonPool());

By default, the parallelism it reports is the number of cores of your machine minus 1, because one core is left for the main thread.
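A small program that prints both numbers side by side, as a sketch; the class name is made up. Note that the common pool's parallelism can be overridden with the java.util.concurrent.ForkJoinPool.common.parallelism system property, so the two values will not always differ by exactly one.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {
    // Number of processors the JVM reports as available.
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the shared common pool.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println("cores = " + cores());
        System.out.println("common pool parallelism = " + commonPoolParallelism());
    }
}
```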

Source link

Time : 1:09:00
