Stopping a git gc --aggressive, is that a bad thing?

Posted 2019-02-07 17:26

Question:

I am running a git gc --aggressive on a very large repo (approx. 100 GB). It's been running since two nights ago, and for the past couple of hours it has been stuck on: "Compressing Objects: 99% (76496/76777)"

If I Ctrl-C the process, what are the consequences? Will my repo be unusable? My intuition says no, but I'd like some opinions. Thanks!

Answer 1:

Git is supposed to always be safe from interruptions like this. If you are worried, though, I suggest pressing Ctrl+Z to suspend the process and then running git fsck --full to make sure the repository is consistent.
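As a rough sketch of that check, the snippet below wraps git fsck --full in a small shell function and reports the result based on the exit status. The name check_repo and the messages it prints are made up for illustration; only git fsck --full and git -C come from Git itself.

```shell
# check_repo is a hypothetical helper name, not a git command.
# Usage: check_repo /path/to/repo   (defaults to the current directory)
check_repo() {
    repo_dir="${1:-.}"
    # git fsck exits non-zero when it finds errors; dangling objects
    # alone are reported but are not treated as errors.
    if git -C "$repo_dir" fsck --full; then
        echo "repository OK"
    else
        echo "repository has errors" >&2
        return 1
    fi
}
```

If the check passes while git gc is suspended with Ctrl+Z, typing fg in the same shell brings the suspended job back to the foreground so it can continue.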

There are a number of git-config variables which might help your git-gc go faster. I use the following on one particular large repo, but there are many more options to randomly try (or carefully study, whichever).

git config pack.threads 1
git config pack.deltaCacheSize 1
git config core.packedGitWindowSize 16m
git config core.packedGitLimit 128m
git config pack.windowMemory 512m

These only help if your problem is that you are running out of memory.
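If you would rather try these values for a single run before writing them to the repository config, one option is to pass them with git -c, which applies a setting to that one invocation only. This is a sketch: gc_low_memory is a hypothetical function name, and the values are simply the ones listed above, not tuned recommendations.

```shell
# gc_low_memory is a hypothetical helper name, not a git command.
# It runs one aggressive gc with the memory-related settings applied
# only to this invocation; nothing is written to .git/config.
gc_low_memory() {
    repo_dir="${1:-.}"
    git -C "$repo_dir" \
        -c pack.threads=1 \
        -c pack.deltaCacheSize=1 \
        -c core.packedGitWindowSize=16m \
        -c core.packedGitLimit=128m \
        -c pack.windowMemory=512m \
        gc --aggressive
}
```

Calling gc_low_memory /path/to/repo leaves the stored configuration untouched, so a plain git gc afterwards behaves as before.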



Answer 2:

FWIW, I just corrupted a repository by aborting git gc with CTRL+C. git fsck now shows the following errors:

error: HEAD: invalid sha1 pointer [...]
error: refs/heads/master does not point to a valid object!
notice: No default references

And quite a few

dangling commit [...]

I'm not going to investigate this, but I would like to point out that I'm going to avoid aborting git gc.



Answer 3:

Note: there is an interesting evolution for git 2.0 (Q2 2014):

"git gc --aggressive" learned "--depth" option and "gc.aggressiveDepth" configuration variable to allow use of a less insane depth than the built-in default value of 250.

This is described in commit 125f814, done by Nguyễn Thái Ngọc Duy (pclouds):

When 1c192f3 (gc --aggressive: make it really aggressive - 2007-12-06) made --depth=250 the default value, it didn't really explain the reason behind, especially the pros and cons of --depth=250.

An old mail from Linus below explains it at length.
Long story short, --depth=250 is a disk saver and a performance killer.
Not everybody agrees on that aggressiveness.
Let the user configure it.

That could help avoid the "freeze" issue you have when running that command on large repos.
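A minimal sketch of using the configuration variable mentioned above: the function stores a value in gc.aggressiveDepth for one repository and prints it back. The name set_aggressive_depth and the default of 50 are assumptions chosen for illustration; pick a depth that suits your repository.

```shell
# set_aggressive_depth is a hypothetical helper name, not a git command.
# Usage: set_aggressive_depth /path/to/repo 50
set_aggressive_depth() {
    repo_dir="${1:-.}"
    depth="${2:-50}"   # illustrative value, not a recommendation
    # Store the depth used by "git gc --aggressive" in the repo config
    git -C "$repo_dir" config gc.aggressiveDepth "$depth"
    # Print the stored value back as confirmation
    git -C "$repo_dir" config --get gc.aggressiveDepth
}
```

After that, git gc --aggressive in the same repository should pick up the configured depth instead of the built-in default.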



Tags: git git-gc