Decoding git objects / “Block length does not matc

Posted 2019-04-08 15:44

Question:

I'm stuck on a simple but annoying issue and cannot find an answer on the Internet. I hope you can point out what I've done wrong.

I'm trying to decode an object from a Git repository. According to ProGit, the file name and its contents are deflated during commit.

I'm using C# to read the object identified by its SHA1 into a stream, inflate it, and convert it into a byte array. Here is the code:

using System.IO;
using System.IO.Compression;

static internal byte[] GetObjectBySha(string storagePath, string sha)
{
    string filePath = Path.Combine(storagePath, "objects", sha.Substring(0, 2), sha.Substring(2, 38));
    byte[] fileContent = null;

    using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (DeflateStream gs = new DeflateStream(fs, CompressionMode.Decompress))
            {
                gs.CopyTo(ms);
            }

            fileContent = ms.ToArray();
        }
    }

    return fileContent;
}

When gs.CopyTo(ms); is reached, a runtime error occurs: Block length does not match with its complement.

Why so?

As for the file I'm trying to read: it is binary and was created by the git executable. The original file name is testfile.txt, its content is Sample text, and the SHA1 is 51d0be227ecdc0039698122a1513421ce35c1dbe.

Any idea would be greatly appreciated!

Answer 1:

DeflateStream and zlib are two different things as explained in this answer:

There is no ZlibStream in the .NET base class library - nothing that produces or consumes ZLIB

So what you need is a ZLIB consumer. The DotNetZip library provides one:

static internal byte[] GetObjectBySha(string storagePath, string sha)
{
    string filePath = Path.Combine(storagePath, "objects", sha.Substring(0, 2), sha.Substring(2, 38));
    byte[] compressed = File.ReadAllBytes(filePath);
    return Ionic.Zlib.ZlibStream.UncompressBuffer(compressed);
}
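
The quoted statement predates .NET 6, which added a built-in System.IO.Compression.ZLibStream. If targeting .NET 6 or later is an option, the same read could probably be done without an external library. The following is only a sketch under that assumption. The class name GitLooseObject and the helper SplitHeader are hypothetical names introduced for illustration. SplitHeader relies on the loose-object layout described in Pro Git, where the inflated data starts with a "<type> <size>" header terminated by a NUL byte.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GitLooseObject
{
    // Reads a loose object file and inflates it.
    // Assumption: .NET 6 or later, where ZLibStream is available.
    internal static byte[] GetObjectBySha(string storagePath, string sha)
    {
        string filePath = Path.Combine(storagePath, "objects", sha.Substring(0, 2), sha.Substring(2));

        using (FileStream fs = File.OpenRead(filePath))
        using (var zs = new ZLibStream(fs, CompressionMode.Decompress))
        using (var ms = new MemoryStream())
        {
            zs.CopyTo(ms);
            return ms.ToArray();
        }
    }

    // Hypothetical helper: splits "<type> <size>\0<content>" into the object type and the content bytes.
    internal static (string Type, byte[] Content) SplitHeader(byte[] raw)
    {
        int nul = Array.IndexOf(raw, (byte)0);
        if (nul < 0)
            throw new InvalidDataException("Missing header terminator.");

        string header = Encoding.ASCII.GetString(raw, 0, nul);
        string type = header.Split(' ')[0];

        byte[] content = new byte[raw.Length - nul - 1];
        Array.Copy(raw, nul + 1, content, 0, content.Length);
        return (type, content);
    }
}
```

On older frameworks ZLibStream does not exist, so the DotNetZip call above remains the simpler route there.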


Answer 2:

ZLib is Deflate with an additional two-byte header, an optional "dictionary", and a four-byte checksum at the end. Depending on your application (for example, if you know there isn't going to be a dictionary), you may be able to get away with chopping off the first two bytes and last four bytes from the data before running it through the DeflateStream. It's a dirty solution, but it could save you from having to bring in an external dependency.
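
This probably also explains the exact error message. A zlib stream typically begins with the byte 0x78, and a raw Deflate decoder reads the low bits of that byte as the start of an uncompressed ("stored") block. It then compares the block length with its one's complement, and the check fails. A rough sketch of the trimming approach follows. The names ZlibHeaderSkip and InflateZlib are hypothetical, the header checks follow the layout in RFC 1950, and the Adler-32 checksum is discarded rather than verified.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class ZlibHeaderSkip
{
    // Inflates a zlib-wrapped buffer with DeflateStream by trimming the zlib framing.
    // Assumptions: no preset dictionary; the trailing Adler-32 checksum is NOT verified.
    internal static byte[] InflateZlib(byte[] zlibData)
    {
        if (zlibData == null || zlibData.Length < 6)
            throw new InvalidDataException("Too short to be a zlib stream.");

        byte cmf = zlibData[0];
        byte flg = zlibData[1];

        // The low nibble of CMF is the compression method (8 = deflate), and the
        // two header bytes read as a big-endian value must be a multiple of 31.
        if ((cmf & 0x0F) != 8 || ((cmf << 8) | flg) % 31 != 0)
            throw new InvalidDataException("Not a zlib header.");

        // Bit 5 of FLG signals a preset dictionary, which this shortcut cannot handle.
        if ((flg & 0x20) != 0)
            throw new NotSupportedException("Preset dictionary present.");

        // Pass only the raw deflate data: skip 2 header bytes, drop 4 checksum bytes.
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 6))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

For a loose object this could be called as ZlibHeaderSkip.InflateZlib(File.ReadAllBytes(filePath)). Because the checksum is ignored, a corrupted file may go unnoticed, which is part of what makes this approach "dirty".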