Does Git never delete any information?

Posted 2019-02-24 07:35

Question:

I have read this post: http://jenkins-ci.org/content/summary-report-git-repository-disruption-incident-nov-10th

which describes an incident where a user accidentally triggered a git push --force from an outdated state of the repository.

Of course, this requires some cleanup to restore the branches to their original state. But since Git never deletes information, as I understand it, this cleanup should always be possible.

So even if you rebase, push --force, or run other operations that rewrite history, the original commits are still there; they just need to be found, right?

In short, are there any (destructive) operations in Git that actually delete data?

Answer 1:

Commits that are not reachable from any reference, including the reflog, will eventually be deleted. The default grace period before this happens is fairly conservative. Several git config options adjust it; see below for some of the specific options.
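To make the reflog point concrete, here is a minimal sketch of recovering a commit that no branch points at any more. It assumes a throwaway repository created with mktemp; the file name, commit messages, identity, and the branch name "recovered" are made up for illustration.

```shell
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # hypothetical identity
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"

echo two > file.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

# Rewrite history: the branch no longer points at "second".
git reset -q --hard HEAD~1

# No branch reaches the commit now, but the reflog still records it.
git --no-pager reflog -n 3

# Recover it by pointing a new branch at the old hash.
git branch recovered "$lost"
recovered=$(git rev-parse recovered)
echo "recovered commit: $recovered"
```

This only works while the reflog entry and the object still exist, which is what the grace periods described in this answer control.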

A lot of people (myself included) would suggest that you set up your receive hooks to deny non-fast-forward pushes, which would largely make this issue moot on the server (individuals could still lose unpushed local work).
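One way to get this effect without writing a hook by hand is the receive.denyNonFastForwards setting on the server repository. The sketch below is an illustration under assumptions, not the setup used in the incident: a local bare repository stands in for the server, and the directory names, branch name "main", and identity are made up.

```shell
set -e

work=$(mktemp -d)
cd "$work"

# A bare repository standing in for the shared server.
git init -q --bare server.git
git -C server.git config receive.denyNonFastForwards true
git -C server.git config receive.denyDeletes true   # also refuse branch deletion

git clone -q server.git client 2>/dev/null
cd client
git config user.name "Example User"          # hypothetical identity
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"
git push -q origin HEAD:refs/heads/main

# Simulate an outdated local state, then try to overwrite the server branch.
git reset -q --hard HEAD~1
if git push -q --force origin HEAD:refs/heads/main 2>/dev/null; then
  force_push=accepted
else
  force_push=rejected
fi
echo "force push was $force_push"
```

With the setting enabled, the server refuses the update even though the client passed --force, so the branch on the server keeps its newer commit.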

gc.auto
When there are approximately more than this many loose objects in the repository, git gc --auto will pack them. Some Porcelain commands use this command to perform a light-weight garbage collection from time to time. The default value is 6700. Setting this to 0 disables it.

gc.pruneexpire
When git gc is run, it will call prune --expire 2.weeks.ago. Override the grace period with this config variable. The value "now" may be used to disable this grace period and always prune unreachable objects immediately.

gc.reflogexpire
gc.<pattern>.reflogexpire
git reflog expire removes reflog entries older than this time; defaults to 90 days. With "<pattern>" (e.g. "refs/stash") in the middle the setting applies only to the refs that match the <pattern>.

gc.reflogexpireunreachable
gc.<pattern>.reflogexpireunreachable
git reflog expire removes reflog entries that are older than this time and are not reachable from the current tip; defaults to 30 days. With "<pattern>" (e.g. "refs/stash") in the middle, the setting applies only to the refs that match the <pattern>.
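To see these options actually delete data, the sketch below sets all three grace periods to "now" in a throwaway repository and runs git gc. The values are deliberately extreme for demonstration and are not a recommendation; the branch name "topic", file name, and identity are made up.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # hypothetical identity
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
base_branch=$(git symbolic-ref --short HEAD)

# Make a commit on a side branch, then delete the branch.
git checkout -q -b topic
echo two > file.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)
git checkout -q "$base_branch"
git branch -q -D topic

# The object is unreachable from any branch but still on disk.
if git cat-file -e "$lost" 2>/dev/null; then before=present; else before=deleted; fi

# Remove every grace period for this repository only (demo values).
git config gc.reflogExpire now
git config gc.reflogExpireUnreachable now
git config gc.pruneExpire now
git gc -q

if git cat-file -e "$lost" 2>/dev/null; then after=present; else after=deleted; fi
echo "before gc: $before, after gc: $after"
```

With the default settings the same commit would survive this git gc run, because the reflog entry would not have expired yet.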


Answer 2:

Git does delete things eventually. It garbage-collects automatically once a repository holds more than a threshold number of loose objects (the gc.auto setting, which the Git documentation lists as 6700 by default). That count covers objects of every kind (commits, trees, blobs, and tags), not only commits. There are ways of undoing things, although it is annoying when someone force-pushes from an old version of a repo. Depending on whether you delete the bad push or revert it, some garbage will be left in your Git history, but Git does eventually clean it up.
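To check how close a repository is to that automatic threshold, git count-objects reports the number of loose objects. A small sketch in a throwaway repository follows; the file name and identity are made up, and the 6700 fallback is the documented default used when gc.auto is not set.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"          # hypothetical identity
git config user.email "user@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "first"

# "count" is the number of loose (not yet packed) objects.
# One commit of one file creates a blob, a tree, and a commit object.
loose=$(git count-objects -v | awk '/^count:/ {print $2}')

# Fall back to the documented default when gc.auto is not configured.
threshold=$(git config --get gc.auto || echo 6700)

echo "$loose loose objects; automatic gc threshold is $threshold"
```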