I am training my neural network with TensorFlow on a CentOS HPC cluster. However, I got this error at the start of the training process:
OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already initialized. OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
The code is for instance segmentation and it has worked fine for many other people, but it fails in my case.
Why does this occur, and how can I solve it?
Simply downgrading my version of TensorFlow using Anaconda did it for me.
I solved this problem by asking an HPC server expert. This may be useful for Compute Canada system users.
Why does it occur?
This error is caused by a conflict between a pre-built TensorFlow Python wheel (which is specific to the Compute Canada system) and the conda environment. Quote: "conda is always a bit problematic because it downloads precompiled binaries, mileage may vary..."
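If you are not sure whether your environment mixes conda packages with pip wheels, a rough check is to ask each installed distribution which installer recorded it. This is only a sketch: `installer_of` is a helper name I made up, and it assumes the package wrote an `INSTALLER` metadata file (pip does this, and conda packages usually do too, but it is not guaranteed).

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the recorded installer ("pip", "conda", ...) or None if unknown.

    Hypothetical helper: it only reads the INSTALLER metadata file, which
    some packages may not ship.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None
    text = dist.read_text("INSTALLER")
    return text.strip() if text else None


if __name__ == "__main__":
    # The package names are just examples; list whatever your project uses.
    for name in ("tensorflow", "numpy"):
        print(name, installer_of(name))
```

If some packages report pip and others report conda, that is a hint (not proof) that binaries from two sources are being loaded together.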
How can it be solved?
As @abccd pointed out, "The best thing to do is to ensure that only a single OpenMP runtime is linked into the process". However, I haven't figured out how to ensure that.
So I uninstalled conda and installed everything on top of the module system using pip install. After that, the network trained fine.
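For anyone who wants to check whether more than one OpenMP runtime ends up in the same process, here is a minimal sketch for Linux. It reads `/proc/self/maps` and lists the distinct shared objects whose names look like an OpenMP runtime. The function names and the file-name pattern (Intel `libiomp5`, LLVM `libomp`, GNU `libgomp`) are my own assumptions, not anything TensorFlow provides.

```python
import re
from pathlib import Path

# Assumed file-name pattern for common OpenMP runtimes:
# Intel (libiomp5), LLVM (libomp), GNU (libgomp), optionally with a
# "-<hash>" suffix as used by some bundled wheels.
OPENMP_PATTERN = re.compile(r"/lib(iomp5|omp|gomp)(-[^/]*)?\.so(\.[^/]*)?$")


def find_openmp_runtimes(maps_text):
    """Return the sorted distinct OpenMP runtime paths in a /proc/<pid>/maps dump."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if OPENMP_PATTERN.search(path):
                paths.add(path)
    return sorted(paths)


def loaded_openmp_runtimes():
    """Inspect the current process (Linux only)."""
    return find_openmp_runtimes(Path("/proc/self/maps").read_text())


if __name__ == "__main__":
    # Import the libraries you suspect before this point, e.g. numpy.
    for path in loaded_openmp_runtimes():
        print(path)
```

Note that in the failing environment the process may abort during the import, before this code gets to run. In that case it can still help to import the suspect libraries one at a time and see which paths show up; two different paths in the output would match the "multiple copies" situation the error message describes.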