145M = .git/objects/pack/
I wrote a script to add up the sizes of differences of each commit and the commit before it going backwards from the tip of each branch. I get 129MB, which is without compression and without accounting for same files across branches and common history among branches.
Git takes all of those things into account, so I would expect a much, much smaller repository. So why is .git so big?
I've done:
git fsck --full
git gc --prune=today --aggressive
git repack
To answer the question about how many files and commits: I have 19 branches with about 40 files in each, and 287 commits, counted using:
git log --oneline --all|wc -l
It should not take tens of megabytes to store information about this.
I recently pulled the wrong remote repository into the local one (git remote add ... and git remote update). After deleting the unwanted remote ref, branches and tags, I still had 1.4GB (!) of wasted space in my repository. I was only able to get rid of this by cloning it with git clone file:///path/to/repository. Note that the file:// makes a world of difference when cloning a local repository: only the referenced objects are copied across, not the whole directory structure.
Edit: Here's Ian's one liner for recreating all branches in the new repo:
Other git objects stored in .git include trees, commits, and tags. Commits and tags are small, but trees can get big, particularly if you have a very large number of small files in your repository. How many files and how many commits do you have?
Did you try using git repack?
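To get a feel for how many trees, commits, tags and blobs a repository actually holds, something like the following sketch may help. The function name objects_by_type is made up for illustration, and the --batch-all-objects option assumes Git 2.6 or newer.

```shell
# Count every object in a repository, grouped by type
# (blob, commit, tag, tree). Reachable and unreachable objects
# are both included.
# objects_by_type is a hypothetical helper name, not a git command.
# Assumes Git 2.6+ for --batch-all-objects.
objects_by_type() {
    git -C "${1:-.}" cat-file --batch-all-objects --batch-check='%(objecttype)' |
        sort |
        uniq -c
}
```

Running objects_by_type /path/to/repository prints one line per object type with its count in front.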
It is worth checking the stacktrace.log. It is basically an error log for tracing commits that failed. I've recently found out that my stacktrace.log is 65.5GB and my app is 66.7GB.
This can happen if you accidentally added a big chunk of files and staged them, even if you never committed them. It can happen in a rails app when you run bundle install --deployment and then accidentally run git add ., which adds all the files under vendor/bundle. You unstage them, but they have already got into git history, so you have to apply Vi's answer, replacing video/parasite-intro.avi with vendor/bundle, and then run the second command he provides.
You can see the difference with git count-objects -v, which in my case reported a size-pack of 52K before applying the script and 3.8K after.
git gc already does a git repack, so there is no sense in manually repacking unless you are going to pass some special options to it.
The first step is to see whether the majority of space is (as would normally be the case) your object database.
This should give a report of how many unpacked objects there are in your repository, how much space they take up, how many pack files you have and how much space they take up.
Ideally, after a repack, you would have no unpacked objects and one pack file, but it's perfectly normal to have some objects which aren't directly referenced by current branches still present and unpacked.
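A report of this kind is what git count-objects -v prints (the same command mentioned earlier in the thread). As a rough sketch, the two size figures can be pulled out like this; the function name repo_object_sizes is invented for the example, and the sizes are in KiB as git reports them without -H.

```shell
# Summarise how much space loose (unpacked) and packed objects use.
# repo_object_sizes is a hypothetical helper name, not a git command.
# "size" and "size-pack" are fields of `git count-objects -v`, in KiB.
repo_object_sizes() {
    git -C "${1:-.}" count-objects -v |
        awk -F': ' '
            $1 == "size"      { loose = $2 }
            $1 == "size-pack" { packed = $2 }
            END { printf "loose=%d KiB packed=%d KiB\n", loose, packed }'
}
```

If the packed figure dominates, the pack itself is where to look next; if the loose figure is large, a git gc has probably not run recently.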
If you have a single large pack and you want to know what is taking up the space then you can list the objects which make up the pack along with how they are stored.
Note that verify-pack takes an index file and not the pack file itself. This gives a report of every object in the pack, its true size and its packed size, as well as information about whether it's been 'deltified' and, if so, the origin of the delta chain.
To see if there are any unusually large objects in your repository, you can sort the output numerically on the third or fourth columns (e.g. | sort -k3n).
From this output you will be able to see the contents of any object using the git show command, although it is not possible to see exactly where in the commit history of the repository the object is referenced. If you need to do this, try something from this question.
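Putting the verify-pack and sort steps together, a sketch along these lines lists the biggest objects in a repository's packs. The function name largest_packed_objects is made up for the example, and it assumes a non-bare repository whose packs live under .git/objects/pack.

```shell
# Show the n largest packed objects, sorted by true (uncompressed)
# size, which is the third column of `git verify-pack -v`.
# largest_packed_objects is a hypothetical helper name.
# Assumes a non-bare repository with packs in .git/objects/pack.
largest_packed_objects() (
    cd "${1:-.}" || exit 1
    n="${2:-10}"
    for idx in .git/objects/pack/pack-*.idx; do
        [ -e "$idx" ] || continue      # no packs at all: glob did not match
        git verify-pack -v "$idx"      # takes the .idx file, not the .pack
    done |
        # keep only the per-object lines, dropping the summary lines
        grep -E '^[0-9a-f]{40,} +(blob|tree|commit|tag) ' |
        sort -k3n |
        tail -n "$n"
)
```

The first column of each line is the object name, which can be passed to git show to look at the contents. Sorting with -k4n instead ranks objects by their size inside the pack.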