I have migrated an old CVS repository with cvs2git (cvs2svn). The resulting dump file is 72 GB, and my attempts to import it with git fast-import always fail with an out-of-memory error:
fatal: Out of memory, malloc failed (tried to allocate 6196691 bytes)
fast-import: dumping crash report to fast_import_crash_13097
error: git-fast-import died of signal 11
My system has 32 GB of RAM and 50 GB of swap. I am running the import on Red Hat 5.3 with Git 1.8.3.4 (gcc44, python2.6.8, cvs2svn2.4.0). I have also tried lifting the limits on stack size and file descriptors, but the memory error is still there.
Does anybody have an idea?
The idea is to:
- split the CVS repo (each repo should represent a "coherent component")
- clean up its content (any big binary that can be regenerated, or stored elsewhere in an artifact repository, should be left out of the CVS repo), since Git doesn't deal well with large files.
Then you would import the cvs (sub-)repos into individual git repos.
Since git is distributed, and not centralized, you want to keep the size of each git repo reasonable.
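For the cleanup step, it helps to first see which files are the heaviest. Here is a minimal sketch, not a definitive tool: the function name find_large_files and the size threshold are my own illustration, and it assumes you have direct filesystem access to the CVS repository, where each tracked file is stored with its full history as an RCS ",v" file.

```python
import os


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for files under root that are at
    least min_bytes in size, largest first.

    Hypothetical helper: in a CVS repository the ",v" files hold the
    whole history of a file, so a large one is a candidate for review.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    hits.sort(reverse=True)  # largest files first
    return hits
```

You could call it as find_large_files("/path/to/cvsroot", 50 * 1024 * 1024) to list everything over 50 MB (both the path and the threshold are placeholders), then decide per file whether it can be regenerated or moved elsewhere before running the conversion.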
I faced the same issue, and it is solved now. Please download the latest cvs2svn, which includes a fix that reduces the size of the dump considerably by reducing the metadata written for symbol commits. The fix is in cvs2git version 2.5 or later.
(You can view the change made in https://github.com/mhagger/cvs2svn/commit/fd177d0151f00b028b4a0df21e0c8b7096f4246b)