I recently ran `git fsck --lost-found` on my repository. I expected to see a couple of dangling commits, where I had reset HEAD. However, I was surprised to see what looked like several thousand dangling blob messages.
I don't believe anything is wrong with my repository, but I'm curious as to what causes these dangling blobs. There are only two people working on the repository, and we haven't done anything out of the ordinary.
I wouldn't think they were created by an older version of a file being replaced by a new one, since git would need to hold onto both blobs so it can display history.
Come to think of it, at one point we did add a VERY large directory (thousands of files) to the project by mistake and then remove it. Might this be the source of all the dangling blobs?
Just looking for insight into this mystery.
Whenever you `add` a file to the index, the contents of that file are added to Git's object database as a blob. When you then `reset` / `rm --cached` that file, the blobs still exist; `gc` will collect them once they are older than its prune grace period (two weeks by default).
However, when those files are part of a commit and you later decide to `reset` history, the old commits are still reachable from Git's reflog and will only be garbage collected after a period of time (usually a month, iirc). Those objects should not show up as dangling, though, since they are still referenced from the reflog.
Last time I looked at this I stumbled across this thread, specifically this part:
So it's normal behavior, and does get collected eventually, I believe.
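If you want to see the mechanism in isolation, here is a minimal sketch that stages a file and then unstages it. It assumes a throwaway repository created with `mktemp`; the file name `big.txt` and the variable names are made up for illustration and are not from the original question.

```shell
#!/bin/sh
set -e

# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Staging a file writes its content into the object database as a blob.
echo "added by mistake" > big.txt
git add big.txt

# Unstaging removes the index entry, but the blob itself stays on disk.
git rm -q --cached big.txt

# Nothing references the blob any more, so fsck should report it as dangling.
dangling=$(git fsck 2>/dev/null | grep -c "dangling blob" || true)
echo "dangling blobs: $dangling"
```

Doing the same thing with a directory of thousands of files would leave one such blob per distinct file content, which matches the situation described in the question.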
edit: Per Daniel, you can immediately collect it by running
I was really impatient and used: