The makefile generation stage of CMake is slow for my project, which is similar to this unanswered question:
CMake is slow to generate makefiles
My project is made up of a top-level CMakeLists.txt file which uses add_subdirectory() to add various subprojects for individual library and executable components.
For a given component, the CMakeLists.txt file contains something like:
add_library(mylib SHARED
sourceFile1.cpp
sourceFile2.cpp
...
)
I can build just the contents of that directory using:
make mylib
If I modify the CMakeLists.txt file in the sub-directory (which I've been doing a lot as part of a migration from pure Makefiles to CMake) and then run make, it correctly re-runs CMake to update the configuration as if I'd run make rebuild_cache.
However, I notice that it in fact reconfigures the entire project. I would really like CMake to be clever enough to know that it only needs to regenerate the Makefile in the current directory and its sub-directories.
Is there a better way to structure a CMake project to achieve this? I see some people use project() for each CMakeLists.txt in each sub-project. In general, is this a good idea?
Alternatively/additionally, is there some way to speed up the generation step of CMake? (It currently takes 60 s or more.)
Bonus points if you want to discuss why CMake itself should or shouldn't be able to run in parallel (imagine a cmake -j).
I've added the meson-build tag as a modest bounty, but alone it hasn't yet attracted enough attention to warrant an answer. It's this kind of problem that might cause people to switch build systems to meson-build (assuming it doesn't have similar problems) or something similar.
It is possible that the correct answer is it can't be done without modifying the source to CMake. To earn the bounty though I require an explanation in terms of how CMake works and/or where it is flawed.
Clarification: It is the generation step that is slow in my case. The configure itself is quick enough, but CMake hangs for quite a while between outputting "-- Configuring done" and "-- Generating done".
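To see the gap between the two status lines without reaching for strace, each line of cmake's output can be stamped with the elapsed time. The following is only a minimal sketch: stamp_lines is a hypothetical helper name, the resolution is whole seconds, and it assumes that date +%s is available and that cmake flushes each status line as it is printed.

```shell
#!/bin/sh
# Prefix every line read from stdin with the number of seconds elapsed
# since the function started, so pauses between status lines show up.
# (stamp_lines is an illustrative name, not part of CMake.)
stamp_lines() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds  %s\n' "$((now - start))" "$line"
    done
}

# Hypothetical usage; the directory arguments are placeholders:
#   cmake -H<source dir> -B<build dir> 2>&1 | stamp_lines
```

A large jump in the stamp between "-- Configuring done" and "-- Generating done" would point at the generation step rather than at the configure step.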
For a full cache rebuild I run:
make -n rebuild_cache
Running CMake to regenerate build system...
using Makefile generator
-- FOOBAR_VERSION: 01.02.03
-- Configuring done
-- Generating done
-- Build files have been written to: /home/brucea/work/depot/emma/main/cmake
real 74.87
user 1.74
sys 1.02
Under the hood this runs:
cmake -H<source dir> -B<build dir>
I presume -B is a synonym for --build. Neither option is described correctly in the documentation. -H is the root of the source directory (not the same as --help as the documentation would have you believe).
It's fast to get to the output of "Configuring done", but slow from there:
For example,
15:44:14 execve("/usr/local/bin/cmake",
>grep Generating cmake_strace.log
>grep "Configuring" cmake_strace.log
15:44:15 write(1, "-- Configuring done\n", 20-- Configuring done
15:45:01 write(1, "-- Generating done\n", 19-- Generating done
>grep "Build files" cmake_strace.log
15:45:22 write(1, "-- Build files have been written"..., 77-- Build files have been written to:
If I edit a single CMakeLists.txt file in a subdirectory and then run make -n, it runs:
cd /home/project/cmake && /usr/local/bin/cmake -H/home/project/cmake -B/home/project/cmake --check-build-system CMakeFiles/Makefile.cmake 0
--check-build-system is another undocumented option.
The effect is the same - regenerate the whole build system, not just the current subtree. There is no difference in behaviour between an in-source and an out-of-source build.
If I run a trace, e.g.:
strace -r cmake --trace -H/home/project/cmake -B/home/project/cmake 2>&1 | tee cmake_rebuild_cache.log
sort -r cmake_rebuild_cache.log | uniq
Most of the time seems to be spent on (or between) open, access and unlink calls.
The duration of each call is quite variable, but the sheer number of them adds up. I have no idea what the Labels.json and Labels.txt files are about (something internal to CMake).
One run:
49.363537 open("/home/projectbar/main/test/foo2bar/CMakeFiles/test2.foo2bar.testViewingSource1.dir/build.make", O_RDONLY) = 5
1.324777 access("/home/projectbar/main/test/performance/CMakeFiles/performancetest.chvcthulhu.testChvcthulhuPerformance2.dir", R_OK) = 0
0.907807 access("/home/projectbar/main/test/foo2bar/CMakeFiles/test2.foo2bar.testPeripheralConnection2.dir", R_OK) = 0
0.670272 unlink("/home/projectbar/main/src/foo2bar/Foo2Bar/CMakeFiles/foo2bar_lib.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.600272 access("/home/projectbar/main/test/foo2bar/testFilesModel2.ok", R_OK) = 0
0.599010 access("/home/projectbar/main/test/hve2snafu/testInvalidByte2c.ok", R_OK) = 0
0.582466 read(5, "openjdk version \"1.8.0_71\"\nOpenJ"..., 1024) = 130
0.570540 writev(3, [{"# CMAKE generated file: DO NOT E"..., 8190}, {"M", 1}], 2) = 8191
0.553576 close(4) = 0
0.448811 unlink("/home/projectbar/main/test/snafu2hve/CMakeFiles/test2.snafu2hve.testNoProbes2.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.431559 access("/home/projectbar/main/src/foo2bar/Foo2Bar/CMakeFiles/foo2bar_lib.dir", R_OK) = 0
0.408003 unlink("/home/projectbar/main/test/lachesis/CMakeFiles/test2.lachesis.testBadSequenceNumber1.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.407120 write(4, "# The set of languages for which"..., 566) = 566
0.406674 write(3, "# CMAKE generated file: DO NOT E"..., 675) = 675
0.383892 read(3, "ewingPeriod.cpp.o -c /home/bruce"..., 8191) = 8191
0.358490 unlink("/home/projectbar/main/cmake/CMakeFiles/mklinks.chvdiff.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
Another run of the same command:
2.009451 unlink("/home/projectbar/main/cmake/CMakeFiles/mklinks.lachesis.dir/Labels.json") = -1 ENOENT (No such file or directory)
) = 20
) = 19
1.300387 access("/home/projectbar/main/test/chvedit/CMakeFiles/test2.chvedit.tefooultiMatchFactoringEdit2.dir", R_OK) = 0
1.067957 access("/home/projectbar/main/test/chvedit/CMakeFiles/test2.chvedit.tefooultiMatchFactoringEdit2.dir", R_OK) = 0
) = 1
0.885854 unlink("/home/projectbar/main/src/gorkyorks2bar/CMakeFiles/doxygen.correct.gorkyorks2bar.dir/Labels.json") = -1 ENOENT (No such file or directory)
0.854539 access("/home/projectbar/main/test/reportImpressions/ReportImpressions/CMakeFiles/testsuite1_reportImpressions.dir", R_OK) = 0
0.791741 unlink("/home/projectbar/main/cmake/CMakeFiles/mklinks.bar_models.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.659506 unlink("/home/projectbar/main/cmake/CMakeFiles/mklinks.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.647838 unlink("/home/projectbar/main/test/libyar/YarModels/CMakeFiles/testsuite1_yarmodels.dir/Labels.txt") = -1 ENOENT (No such file or directory)
0.620511 unlink("/home/projectbar/main/test/libyar/YarModels/CMakeFiles/testsuite1_yarmodels.dir/Labels.json") = -1 ENOENT (No such file or directory)
0.601942 unlink("/home/projectbar/main/cmake/CMakeFiles/mklinks.lachesis.dir/Labels.txt") = -1 ENOENT (No such file or directory)
0.591871 access("/home/projectbar/main/src/runbardemo/simple_demo/CMakeFiles", R_OK) = 0
0.582448 write(3, "CMAKE_PROGRESS_1 = \n\n", 21) = 21
0.536947 write(3, "CMAKE_PROGRESS_1 = \n\n", 21) = 21
0.499758 unlink("/home/projectbar/main/test/foo2bar/CMakeFiles/test2.foo2bar.testInputDirectory1.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.458120 unlink("/home/projectbar/main/test/yak2dcs/CMakeFiles/test2.yak2dcs.testsuite2.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.448104 unlink("/home/projectbar/main/test/reportImpressions/CMakeFiles/test2.reportImpressions.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.444344 access("/home/projectbar/main/src/bananas/CMakeFiles/bin.bananas.dir", R_OK) = 0
0.442685 unlink("/home/projectbar/main/test/rvedit/CMakeFiles/test2.rvedit.tefooissingOptionValue.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.425604 unlink("/home/projectbar/main/test/listdcs/CMakeFiles/test2.listdcs.testListCalls5.dir/progress.make.tmp") = -1 ENOENT (No such file or directory)
0.391163 access("/home/projectbar/main/src/siedit/CMakeFiles/siedit.dir", R_OK) = 0
0.362171 access("/home/projectbar/main/test/foo2bar/CMakeFiles/test2.foo2emma.testHowResults6.dir", R_OK) = 0
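Picking the slowest individual lines out of a sorted log only shows the outliers. A rough per-syscall total can be had by summing the relative timestamps, as in the sketch below (summarise_strace is a hypothetical helper name). One caveat is an assumption worth stating loudly: with strace -r the stamp on a line is the time since the previous system call began, so each total is only attributed to the call that follows the gap; strace -T, which reports the time spent inside each call, would be a more precise input.

```shell
#!/bin/sh
# Sum the leading relative timestamps of `strace -r` output per syscall
# name and list the most expensive names first.
# (summarise_strace is an illustrative name.)
summarise_strace() {
    awk '
        # Only lines that start with a timestamp such as 0.670272 count;
        # fragments like ") = 20" are skipped.
        $1 ~ /^[0-9]+\.[0-9]+$/ {
            split($2, parts, "(")          # "unlink(..." becomes "unlink"
            total[parts[1]] += $1
            count[parts[1]]++
        }
        END {
            for (name in total)
                printf "%-10s %6d calls %10.6f s\n", name, count[name], total[name]
        }
    ' "$@" | sort -k4,4nr
}

# Hypothetical usage on the log written by the strace command above:
#   summarise_strace cmake_rebuild_cache.log
```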
Note the Ninja generator is much faster (though still not brilliant). For example,
/usr/bin/time -p ninja rebuild_cache
ninja: warning: multiple rules generate ../src/ams2yar/ams2yar. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/vox/vox. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/bananas/bananas. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/fidlertypes2fidlerinfo/fidlertypes2fidlerinfo. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/mkrundir/mkrundir. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/runyar/runyar. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
ninja: warning: multiple rules generate ../src/runyardemo/runyardemo. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
[1/1] Running CMake to regenerate build system...
Generator=Ninja
-- FOO_VERSION: 01.02.03
-- Configuring done
-- Generating done
-- Build files have been written to: /home/project/cmake/build
real 12.67
user 1.01
sys 0.31
Note that the project is not quite ready for Ninja yet as there are errors like:
ninja: warning: multiple rules generate ../src/runfoobardemo/runfoobardemo. builds involving this target will not be correct; continuing anyway [-w dupbuild=warn]
and
ninja: error: dependency cycle: ../src/foobar -> ../src/foobar/CMakeFiles/foobar -> ../src/ams2emma/foobar
to be resolved. This question is really about why the Makefile generator is slow. I'm not sure if the problems Ninja shows are useful hints here or red herrings.
Building CMake with more optimisations does not help.
Based on my trace output and the output of time, it is unlikely that it would. The user time, and therefore the time spent within the CMake code itself, is quite low (see e.g. What do 'real', 'user' and 'sys' mean in the output of time(1)?).
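The argument can be made concrete by subtracting the CPU time from the wall-clock time. The sketch below (off_cpu_time is a hypothetical helper name) reads time -p style output and prints the part of the real time that the process and its waited-for children did not spend on a CPU, here fed with the figures from the full cache rebuild above.

```shell
#!/bin/sh
# Read `time -p` style output (real/user/sys, one value per line) on stdin
# and print real - user - sys, i.e. the wall-clock time spent waiting
# rather than computing. (off_cpu_time is an illustrative name.)
off_cpu_time() {
    awk '
        $1 == "real" { real = $2 }
        $1 == "user" { user = $2 }
        $1 == "sys"  { sys  = $2 }
        END { printf "%.2f\n", real - user - sys }
    '
}

# Figures from the rebuild_cache run quoted earlier in the question:
printf 'real 74.87\nuser 1.74\nsys 1.02\n' | off_cpu_time   # prints 72.11
```

With roughly 72 of 75 seconds spent off the CPU, compiling CMake with more aggressive optimisation can only touch the remaining few seconds.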
Here's what I tried for completeness:
export CXX_FLAGS="-O3 -ftree-vectorise -msse2"
cmake -DCMAKE_BUILD_TYPE=RELEASE
Actually using a more optimised CMake does make the configure part faster, but in my case it is the generate part that is slow. It would appear from the timing that this step is somehow I/O bound.
I decided to investigate Florian's idea that using memory streams instead of file streams for temporary files might make a difference.
I decided to try the easy route and hacked CMake to write .tmp files to a RAM disk instead.
I then went the whole hog and tried generating the build system on the RAM disk:
sudo mkdir /mnt/ramdisk
sudo mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk
/usr/bin/time -p cmake -H/<source> -B/mnt/ramdisk/build
I was very surprised to find this makes no difference at all to the wall clock time:
real 59.61
user 1.55
sys 0.62
>du -sh /mnt/ramdisk/build/
4.4M /mnt/ramdisk/build/
Similarly with ramfs:
real 51.09
user 1.58
sys 0.50
What could be happening here? I was guessing sub-processes, but I can't work out which sub-processes are consuming the wall clock time if any of them are. They look to be very short lived.
For completeness, here is some output from perf (CMake built with -fno-omit-frame-pointer):
perf record -g --call-graph dwarf cmake -H<source> -B<build>
perf report -g graph
Samples: 17K of event 'cycles', Event count (approx.): 14363392067
Children Self Command Shared Object Symbol
+ 65.23% 0.00% cmake cmake [.] do_cmake
+ 65.02% 0.00% cmake cmake [.] cmake::Run
+ 60.32% 0.00% cmake cmake [.] main
+ 59.82% 0.00% cmake libc-2.17.so [.] __libc_start_main
+ 57.78% 0.00% cmake cmake [.] _start
+ 55.04% 0.00% cmake cmake [.] cmGlobalUnixMakefileGenerator3::Generate
+ 54.56% 0.00% cmake cmake [.] cmake::Generate
+ 49.90% 0.00% cmake cmake [.] cmGlobalGenerator::Generate
+ 38.87% 0.02% cmake cmake [.] cmLocalUnixMakefileGenerator3::Generate
+ 18.65% 0.01% cmake cmake [.] cmMakefileTargetGenerator::WriteTargetBuildRules
+ 17.05% 0.02% cmake cmake [.] cmMakefile::ExecuteCommand
+ 16.99% 0.01% cmake cmake [.] cmMakefile::ReadListFile
+ 16.84% 0.01% cmake cmake [.] cmCommand::InvokeInitialPass
+ 16.79% 0.00% cmake cmake [.] cmMakefile::Configure
+ 14.71% 0.00% cmake cmake [.] cmMakefile::ConfigureSubDirectory
+ 14.67% 0.05% cmake cmake [.] cmMacroHelperCommand::InvokeInitialPass
+ 14.27% 0.02% cmake cmake [.] cmMakefileUtilityTargetGenerator::WriteRuleFiles
+ 13.91% 0.00% cmake cmake [.] cmGlobalGenerator::Configure
+ 13.50% 0.05% cmake cmake [.] cmOutputConverter::Convert
+ 13.48% 0.00% cmake cmake [.] cmAddSubDirectoryCommand::InitialPass
+ 13.46% 0.00% cmake cmake [.] cmMakefile::AddSubDirectory
+ 12.91% 0.00% cmake cmake [.] cmGlobalUnixMakefileGenerator3::Configure
+ 12.82% 0.00% cmake cmake [.] cmake::ActualConfigure
+ 10.90% 0.00% cmake cmake [.] cmake::Configure
+ 10.55% 0.02% cmake cmake [.] cmMakefileTargetGenerator::WriteObjectRuleFiles
+ 10.35% 0.09% cmake cmake [.] cmLocalUnixMakefileGenerator3::WriteMakeRule
+ 9.76% 0.03% cmake cmake [.] cmMakefileTargetGenerator::WriteObjectBuildFile
+ 7.97% 0.00% cmake cmake [.] cmMakefileLibraryTargetGenerator::WriteRuleFiles
+ 7.93% 0.00% cmake cmake [.] cmMakefileExecutableTargetGenerator::WriteRuleFiles
+ 7.88% 0.00% cmake cmake [.] cmLocalUnixMakefileGenerator3::WriteLocalMakefile
+ 7.68% 0.02% cmake [kernel.kallsyms] [k] sysret_audit
+ 7.60% 0.05% cmake [kernel.kallsyms] [k] __audit_syscall_exit
+ 7.40% 0.08% cmake cmake [.] cmsys::SystemTools::CollapseFullPath
And perf report -g graph --no-children:
+ 2.86% cmake libc-2.17.so [.] _int_malloc
+ 2.15% cmake libc-2.17.so [.] __memcpy_ssse3_back
+ 2.11% cmake [kernel.kallsyms] [k] find_next_bit
+ 1.84% cmake libc-2.17.so [.] __memcmp_sse4_1
+ 1.83% cmake libc-2.17.so [.] _int_free
+ 1.71% cmake libstdc++.so.6.0.20 [.] std::__ostream_insert >
+ 1.18% cmake libstdc++.so.6.0.20 [.] std::basic_string, std::allocator >::~basic_string
+ 1.13% cmake libc-2.17.so [.] malloc
+ 1.12% cmake cmake [.] cmOutputConverter::Shell__ArgumentNeedsQuotes
+ 1.11% cmake libstdc++.so.6.0.20 [.] std::string::compare
+ 1.08% cmake libc-2.17.so [.] __strlen_sse2_pminub
+ 1.05% cmake cmake [.] std::string::_S_construct
+ 1.04% cmake cmake [.] cmsys::SystemTools::ConvertToUnixSlashes
+ 0.97% cmake cmake [.] yy_get_previous_state
+ 0.87% cmake cmake [.] cmOutputConverter::Shell__GetArgument
+ 0.76% cmake libstdc++.so.6.0.20 [.] std::basic_filebuf >::xsputn
+ 0.75% cmake libstdc++.so.6.0.20 [.] std::string::size
+ 0.75% cmake cmake [.] cmOutputConverter::Shell__SkipMakeVariables
+ 0.74% cmake cmake [.] cmOutputConverter::Shell__CharNeedsQuotesOnUnix
+ 0.73% cmake [kernel.kallsyms] [k] mls_sid_to_context
+ 0.72% cmake libstdc++.so.6.0.20 [.] std::basic_string, std::allocator >::basic_string
+ 0.71% cmake cmake [.] cmOutputConverter::Shell__GetArgumentSize
+ 0.65% cmake libc-2.17.so [.] malloc_consolidate
+ 0.65% cmake [kernel.kallsyms] [k] mls_compute_context_len
+ 0.65% cmake cmake [.] cmOutputConverter::Shell__CharNeedsQuotes
+ 0.64% cmake cmake [.] cmSourceFileLocation::Matches
+ 0.58% cmake cmake [.] cmMakefile::ExpandVariablesInStringNew
+ 0.57% cmake cmake [.] std::__deque_buf_size
+ 0.56% cmake cmake [.] cmCommandArgument_yylex
+ 0.55% cmake cmake [.] std::vector >::size
+ 0.54% cmake cmake [.] cmsys::SystemTools::SplitPath
+ 0.51% cmake libstdc++.so.6.0.20 [.] std::basic_streambuf >::xsputn
Many factors determine how long CMake's configuration and generation steps take: besides what you actually do in your CMakeLists.txt files, there are e.g. your host system, your toolchain and which CMake version/distribution you are using.
So I'll try to concentrate on the specific questions you have.
Rebuild/rewrite makefiles for just a subdirectory?
For a start: using add_subdirectory() is good for structuring your CMake code. But you have to keep in mind that you can always change global CMake properties in a subdirectory and that targets inside those subdirectories can have cross-dependencies.
So what does CMake do (considering the "I have touched one CMakeLists.txt file in a subdirectory" case discussed here):
- If a CMakeLists.txt file is changed, it goes through the complete hierarchy of CMakeLists.txt files again and rebuilds the build environment in memory.
- It then writes all the necessary makefiles to temporary files and replaces an existing makefile only if its content differs (see cmGeneratedFileStreamBase::Close()).
This behavior is necessary because any makefile can change even when only a subdirectory's CMakeLists.txt file has changed, and it was optimized to prevent unnecessary rebuilds during the actual make step (from touched makefiles).
Is there some way to speed up the generation step of CMake?
So yes, it does temporarily rewrite all makefiles (which could be slow) and no, you can't restrict this to only the changed subdirectory while using add_subdirectory().
Maybe one possible performance optimization for the future in CMake's own code would be to use memory streams instead of file streams for the temporary files.
@BruceAdams tested this by using a RAM disk for the generated makefile environment, with no effect.
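The "write everything to a temporary file, then replace the real file only if the content differs" idea can be sketched in a few lines of shell. This is not CMake's code (that is C++, around cmGeneratedFileStreamBase::Close()); write_if_different is a hypothetical helper name and the .tmp suffix is chosen only to mirror the progress.make.tmp names visible in the strace output.

```shell
#!/bin/sh
# Write stdin to "$1", but leave an identical existing file untouched so
# that its modification time, and anything make derives from it, does not
# change. (write_if_different is an illustrative name.)
write_if_different() {
    target=$1
    tmp="$target.tmp"
    cat > "$tmp"                        # always generate the full content
    if [ -f "$target" ] && cmp -s "$tmp" "$target"; then
        rm -f "$tmp"                    # unchanged: discard the temporary
    else
        mv -f "$tmp" "$target"          # new or changed: replace the file
    fi
}

# Hypothetical usage:
#   generate_rules | write_if_different build/Makefile
```

The pattern explains both observations: every makefile is regenerated on each run (hence the I/O), yet make does not rebuild targets whose makefiles came out identical.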
And yes, the CMake-generated cmake_check_build_system rule does almost the same as the rebuild_cache rule, and yes, the -B, -H and --check-build-system options used are CMake-internal command line options and therefore undocumented (even if often referred to on Stack Overflow, e.g. in one of my answers here).
What helped me to speed up the configuration/generation was to rebuild CMake itself with a lot more optimization options than the normal distributions and using a 64-bit toolchain instead of the 32-bit versions currently still distributed.
Here are some test results (using the CMake test script found below with 100 subdirectories/libraries) on my Windows PC always using the same MSYS environment, but different CMake compilations of the same CMake source code:
Official CMake 3.2.2 version:
Using mingw32 and GNU 4.8.1 I rebuilt CMake 3.2.2 with
and got
And the same with my antivirus software turned off:
Using mingw-w64 and GNU 5.3.0 I rebuilt CMake 3.2.2 with
and got
And the same with my antivirus software turned off:
To summarize, I see two main influences:
1st: The configuration step can be sped up by going for a 64-bit version and optimizing for your processor platform (you would certainly have to find a common base -march=... or -mtune=... for all your project's build PCs).
2nd: The generation step can mostly be sped up by searching for possible file I/O bottlenecks outside of CMake. In my case, telling the antivirus software not to check the toolchain and build directories each time I read/write to those really sped things up.
Remark: I confirmed @BruceAdams' test results: the compiler's auto-vectorization (default for -O3 or -Ofast) is not able to do much about CMake source code's ability to run in multiple processes/on multiple cores.
Is there a better way to structure a CMake project to achieve this?
Yes, if you e.g. know that a certain sub-tree of your CMake script code just generates a library and has no dependencies, you could put that part in an external project by using ExternalProject_Add(). And yes, I have had similar concerns regarding large CMake projects, and this is seen as a good "modern CMake" practice (see also references below).
References
What I used to reproduce your problem
Just for completeness and if someone wants to check those numbers against his/her own, here is my test code:
And then, after the first cmake .. and make calls, doing: