How can I trigger garbage collection on a Git remo

Posted 2019-01-17 03:12

Question:

As we know, we can periodically run git gc to pack objects under .git/objects.

In the case of a remote central Git repository (bare or not), though, after many pushes there are many files under myproj.git/objects; each commit seems to create a new file there.

How can I pack that many files? (I mean the ones on the remote central bare repository, not on local clone repository.)

Answer 1:

The remote repo should be configured to run gc as needed after a push is received. See the documentation of gc.auto in the git-gc and git-config man pages.

However, a remote repo shouldn't need all that much garbage collection, since it will rarely have dangling (unreachable) commits. These usually result from things like branch deletion and rebasing, which typically happen only in local repos.

So gc is needed more for repacking, which is for saving storage space rather than removing actual garbage. The gc.auto variable is sufficient for taking care of this.
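As a rough illustration of the settings mentioned above, the sketch below sets them on a throwaway bare repository. The repository path is hypothetical, 6700 is simply Git's built-in default for gc.auto written out explicitly, and receive.autogc is an additional setting not named in this answer (it decides whether a push triggers "git gc --auto" on the receiving side).

```shell
#!/bin/sh
set -e

# Throwaway bare repository standing in for the central repo (hypothetical path).
repo=$(mktemp -d)/myproj.git
git init --quiet --bare "$repo"

# gc.auto: number of loose objects above which "git gc --auto" repacks.
# When unset, Git uses 6700; setting it to 0 disables automatic gc.
git -C "$repo" config gc.auto 6700

# receive.autogc: whether receive-pack runs "git gc --auto" after a push.
git -C "$repo" config receive.autogc true

# The same check can be run by hand; it does nothing while under the threshold.
git -C "$repo" gc --auto

# Show the value that is now stored in the repository's config file.
git -C "$repo" config --get gc.auto
```

On a real server the same "git config" commands would be run inside the existing bare repository rather than in a temporary one.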



Answer 2:

While you should have some process that takes care of this periodically and automatically, it's no problem to run

git gc

on a bare repository

git@domU:/pix/git/repositories/abd.git$ ls -l

total 28
drwxrwxr-x   2 git git    6 2010-06-06 02:44 branches
-rw-rw-r--   1 git git   66 2010-06-06 02:44 config
-rw-r--r--   1 git git   23 2011-03-15 18:19 description
-rw-rw-r--   1 git git   23 2010-06-06 02:44 HEAD
drwxrwxr-x   2 git git 4096 2010-06-06 02:44 hooks
drwxrwxr-x   2 git git   20 2010-06-06 02:44 info
drwxrwxr-x 260 git git 8192 2010-09-01 00:26 objects
drwxrwxr-x   4 git git   29 2010-06-06 02:44 refs

$ git gc
Counting objects: 3833, done.
Compressing objects:  31% (1085/3500)...
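To see what a manual run changes, one option is to compare "git count-objects -v" before and after. The following is only a sketch: it builds a temporary bare repository and a temporary clone (both paths, the branch name, and the demo identity are made up for the example), pushes a few small commits, and then packs the bare repository.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
git init --quiet --bare "$work/myproj.git"   # stand-in for the central repo
git init --quiet "$work/clone"               # stand-in for a developer's clone

cd "$work/clone"
for i in 1 2 3; do
    echo "change $i" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com \
        commit --quiet -m "commit $i"
    # Small pushes are unpacked into loose objects on the receiving side.
    git push --quiet "$work/myproj.git" HEAD:refs/heads/master
done

cd "$work/myproj.git"
git count-objects -v    # "count" = loose objects, "in-pack" = packed objects
git gc --quiet
git count-objects -v    # reachable objects have moved from "count" to "in-pack"
```

The "count" line is the number of loose files under objects/, which is what the question describes as piling up after many pushes.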


Answer 3:

This question should shed some light on how often you should run garbage collection.

The easiest option would be to use a scheduled task on Windows or a cron job on Unix to run git gc periodically. This way you don't even need to think about it.
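A cron-driven version could look roughly like the script below. This is a sketch under assumptions: the script name gc-all.sh, the /srv/git root, the weekly schedule, and the convention that bare repositories end in ".git" are all hypothetical and would need adapting to the actual server layout.

```shell
#!/bin/sh
# gc-all.sh: run "git gc" in every bare repository directly under a root directory.
#
# Hypothetical crontab entry (every Sunday at 03:00):
#   0 3 * * 0 /usr/local/bin/gc-all.sh /srv/git

gc_all() {
    root=$1
    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue      # skip when the glob matched nothing
        # --quiet keeps cron from mailing progress output on every run.
        git -C "$repo" gc --quiet || echo "gc failed in $repo" >&2
    done
}

# Only do work when a root directory was passed on the command line.
if [ -n "${1:-}" ]; then
    gc_all "$1"
fi
```

Running it as the same user that owns the repositories (for example the git user shown in the listing above) avoids leaving pack files with the wrong ownership.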



Answer 4:

after many pushes, there are many files under myproj.git/objects

There won't be as many with Git 2.11+ (Q4 2016) and a pre-receive hook.
In that scenario, you won't have to trigger a git gc at all.

See commit 62fe0eb, commit e34c2e0, commit 722ff7f, commit 2564d99, commit 526f108 (03 Oct 2016) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 25ab004, 17 Oct 2016)

receive-pack: quarantine objects until pre-receive accepts

In order for the receiving end of "git push" to inspect the received history and decide to reject the push, the objects sent from the sending end need to be made available to the hook and the mechanism for the connectivity check, and this was done traditionally by storing the objects in the receiving repository and letting "git gc" to expire it.

Instead, store the newly received objects in a temporary area, and make them available by reusing the alternate object store mechanism to them only while we decide if we accept the check, and once we decide, either migrate them to the repository or purge them immediately.

That temporary area is exposed through the new environment variable GIT_QUARANTINE_PATH (defined in Git's source code as the GIT_QUARANTINE_ENVIRONMENT macro).

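That way, if a (big) push is rejected by a pre-receive hook, those big objects won't be lying around for weeks waiting for git gc to clean them up (by default, git gc only prunes unreachable loose objects once they are older than gc.pruneExpire, which is two weeks).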
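For illustration, a pre-receive hook that rejects oversized files might be sketched as follows. Nothing here comes from the commits cited above: the function name check_push, the MAX_BLOB_BYTES variable, and the 10 MiB default are hypothetical. With Git 2.11+, the objects this hook inspects sit in the quarantine area, so a rejection discards them immediately.

```shell
#!/bin/sh
# hooks/pre-receive (sketch): reject a push containing a blob above a size limit.
# Git feeds one "<old-sha> <new-sha> <refname>" line per updated ref on stdin.

check_push() {
    # MAX_BLOB_BYTES is a hypothetical override; the default is 10 MiB.
    limit=${MAX_BLOB_BYTES:-10485760}
    while read -r old new ref; do
        # An all-zero new id means the ref is being deleted: nothing to inspect.
        case $new in *[!0]*) ;; *) continue ;; esac
        # List objects reachable from the new tip but from no existing ref,
        # then print the path of the first blob that is over the limit.
        big=$(git rev-list --objects "$new" --not --all |
              git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
              awk -v limit="$limit" \
                  '$1 == "blob" && $2 + 0 > limit + 0 { print $3; exit }')
        if [ -n "$big" ]; then
            echo "rejected $ref: $big is larger than $limit bytes" >&2
            return 1
        fi
    done
    return 0
}

# Run the check only when this file is installed and invoked as the hook itself.
case $0 in */pre-receive) check_push ;; esac
```

To try it, the file would be saved as hooks/pre-receive inside the bare repository and made executable; a non-zero exit status makes Git refuse the whole push.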