QUESTION
How can I run different instances of the same program in different cores at 100%?
CONTEXT
I am running C++11 code on an iMac Pro (2017) with macOS High Sierra 10.13.6. The corresponding executable is called 'bayesian_estimation'.
When I run one instance of this program, one of the cores runs it at 100%, as you can see here:
If I run more instances, the CPU% of each of them goes down, but most of the cores remain idle! Why aren't they being used? See, for example, what happens when 3 'bayesian_estimation' processes are running:
Or when I execute 7:
Ideally, in the last picture, I would like to have 7 cores completely busy, each of them running one 'bayesian_estimation' process.
EDIT 1
Here is more information that might help to identify the problem. I compiled my code as follows:
g++ -std=c++11 -Wall -g bayesian_estimation.cpp -o bayesian_estimation -O2 -larmadillo
And these are all the headers that I include:
#include <iostream> // Standard input and output functions.
#include <iomanip> // Manipulate stream input and output functions.
#include <armadillo> // Load Armadillo library.
#include <sys/stat.h> // To obtain information from files (e.g., S_ISDIR).
#include <dirent.h> // Format of directory entries.
#include <vector> // To deal with vectors.
I identified the origin of the bottleneck that @bolov mentions in the comments. It is caused by the call to
arma_rng::set_seed_random()
in the code, which seeds the random number generator of the Armadillo library. If I remove that line, the problem is gone. A question going deeper into this issue and providing a reproducible example is posted here.
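A possible workaround, if each process still needs its own seed, is to compute the seed once at startup and pass it to Armadillo's arma_rng::set_seed(value) instead of calling arma_rng::set_seed_random(). The following is only a minimal sketch under that assumption: the helper names make_process_seed and current_process_seed are hypothetical (they are not part of my program or of Armadillo), and I have not checked that it removes the slowdown in my real code.

```cpp
#include <chrono>    // std::chrono::high_resolution_clock
#include <cstdint>   // std::uint64_t
#include <unistd.h>  // getpid (POSIX, available on macOS)

// Hypothetical helper: mixes a timestamp and a process id into one 64-bit
// seed, so that instances started at almost the same time get different seeds.
std::uint64_t make_process_seed(std::uint64_t time_ticks, std::uint64_t pid) {
    // Spread the pid over all bits before combining it with the timestamp.
    std::uint64_t x = time_ticks ^ (pid * 0x9E3779B97F4A7C15ULL);
    // splitmix64-style finalizer: an invertible scrambling of the 64 bits.
    x ^= x >> 30;
    x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27;
    x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

// Hypothetical helper: builds the seed for the current process from the
// clock and the pid.
std::uint64_t current_process_seed() {
    const auto ticks =
        std::chrono::high_resolution_clock::now().time_since_epoch().count();
    return make_process_seed(static_cast<std::uint64_t>(ticks),
                             static_cast<std::uint64_t>(getpid()));
}

// Intended use with Armadillo (assumption: called once, at program start):
//   arma::arma_rng::set_seed(
//       static_cast<arma::arma_rng::seed_type>(current_process_seed()));
```

The seed is derived from the pid as well as the clock because several instances may be launched within the same clock tick. The result should be treated as a convenience seed for simulations, not as cryptographic-quality randomness.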