The SHA1 hashes stored in the tree objects (as returned by git ls-tree) do not match the SHA1 hashes of the file content (as returned by sha1sum):
$ git cat-file blob 4716ca912495c805b94a88ef6dc3fb4aff46bf3c | sha1sum
de20247992af0f949ae8df4fa9a37e4a03d7063e -
How does git compute file hashes? Does it compress the content before computing the hash?
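git hash-object
This is a quick way to verify your test method: hash some content with git hash-object, then hash the same content with the "blob <size>\0" header prepended using sha1sum (which is in GNU Coreutils). The two hashes should match.
Then it comes down to understanding the format of each object type. We have already covered the trivial blob; the others are commit, tree, and tag.
I needed this for some unit tests in Python 3, so I thought I'd leave it here.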
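The Python snippet itself is not shown here, so the following is only a minimal sketch of what such a helper could look like, assuming the blob format described in this thread ("blob ", the size in bytes, a NUL byte, then the content). The function name git_blob_hash is made up for illustration.

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Return the SHA-1 that Git would assign to a blob with this content.

    Hypothetical helper name; the header layout is the one described
    in this thread: b"blob <size>\\0" followed by the raw bytes.
    """
    # If Git normalises line endings in your setup, you may need
    # data = data.replace(b"\r\n", b"\n") before hashing.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Same input as the echo/shasum example in this thread.
    print(git_blob_hash(b"Hello, World!\n"))
    # prints 8ab686eafeb1f44702738c8b0f24f2567c36da6d
```

The helper takes bytes rather than str so that the size in the header is always the byte length, which is what Git uses.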
I stick to \n line endings everywhere, but in some circumstances Git might also change your line endings before calculating this hash, so you may need a .replace('\r\n', '\n') in there too.
Based on Leif Gruenwoldt's answer, here is a shell function substitute for git hash-object:
Test:
I am only expanding on the answer by @Leif Gruenwoldt and detailing what is in the reference he provided.
Do It Yourself..
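How does Git compute its blob hashes
Hash (SHA1) = SHA1("blob " + <size_of_file> + "\0" + <contents_of_file>)
The text blob⎵ is a constant prefix, and \0 is also constant: it is the NUL character. The <size_of_file> (the length in bytes, written as a decimal number) and <contents_of_file> vary depending on the file.
See: What is the file format of a git commit object?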
And that's all, folks!
But wait! Did you notice that the <filename> is not a parameter used for the hash computation? Two files will have the same hash if their contents are the same, regardless of their names and of the date and time they were created. This is one of the reasons Git handles moves and renames better than other version control systems.
Do It Yourself (Ext)
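One way to try this yourself without touching a repository is sketched below in Python. It writes the same content to files with different names and hashes each one the way Git hashes a blob. The function names and the file names are hypothetical, chosen only for this illustration.

```python
import hashlib
import os
import tempfile


def git_blob_hash_of_file(path: str) -> str:
    """Hash a file the way Git hashes a blob: only size and content are used."""
    with open(path, "rb") as fh:
        data = fh.read()
    # The path is never mixed into the hashed bytes.
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def hashes_for_same_content(names, content: bytes) -> dict:
    """Write `content` to several differently named files and hash each one."""
    result = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            path = os.path.join(tmp, name)
            with open(path, "wb") as fh:
                fh.write(content)
            result[name] = git_blob_hash_of_file(path)
    return result


if __name__ == "__main__":
    # Two empty files with different names get the same hash.
    for name, digest in hashes_for_same_content(["a.txt", "b.txt"], b"").items():
        print(name, digest)
```

Both lines of output should show the same digest, since the name is stored in the tree object that references the blob, not in the blob itself.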
Note:
The link does not mention how the tree object is hashed. I am not certain of the algorithm and parameters; however, from my observation it probably computes a hash based on all the blobs and trees (probably their hashes) it contains.
Git prefixes the content with "blob ", then the length in bytes, then a NUL character before hashing:
$ echo -e 'blob 14\0Hello, World!' | shasum
8ab686eafeb1f44702738c8b0f24f2567c36da6d
Source: http://alblue.bandlem.com/2011/08/git-tip-of-week-objects.html
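Returning to the note about tree objects: as far as I understand the object format (this is not taken from the linked article), a tree is hashed like a blob, with a "tree <size>\0" header, and its body is one entry per child consisting of the mode, a space, the name, a NUL byte, and the child's SHA-1 as 20 raw bytes. The sketch below assumes that layout; the function names are hypothetical.

```python
import hashlib


def git_object_hash(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' followed by the body."""
    header = obj_type.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def _sort_key(entry):
    mode, name, _ = entry
    # Assumption: Git orders entries by name, treating directories
    # (mode "40000") as if their name ended with "/".
    return (name + "/" if mode == "40000" else name).encode()


def git_tree_hash(entries) -> str:
    """entries: iterable of (mode, name, hex_sha) tuples, e.g.
    ("100644", "hello.txt", "8ab686eafeb1f44702738c8b0f24f2567c36da6d").
    """
    body = b""
    for mode, name, hex_sha in sorted(entries, key=_sort_key):
        # The child's hash is stored as 20 raw bytes, not as hex text.
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_sha)
    return git_object_hash("tree", body)


if __name__ == "__main__":
    blob = "8ab686eafeb1f44702738c8b0f24f2567c36da6d"
    print(git_tree_hash([("100644", "hello.txt", blob)]))
    # Renaming the file changes the tree hash but not the blob hash.
    print(git_tree_hash([("100644", "renamed.txt", blob)]))
```

If this layout is right, it matches the observation in the note: the tree hash depends on the hashes of the blobs and trees it contains, plus their names and modes.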