Does BinaryFormatter apply any compression?

2019-04-25 08:19发布

问题:

When .NET's BinaryFormatter is used to serialize an object graph, is any type of compression applied?

I ask in the context of whether I should worry about the object graph having many repeated strings and integers.

Edit - Hold on, if strings are interned in .NET, there's no need to worry about repeated strings, right?

回答1:

No, it doesn't provide any compression but you can compress the output yourself using the GZipStream type.

Edit: Mehrdad has a wonderful example of this technique in his answer to How to compress a .net object instance using gzip.

Edit 2: Strings can be interned but that doesn't mean that every string is interned. I wouldn't make any assumptions on how or why the CLR decides to intern strings as this can change (and has changed) from version to version.



回答2:

No, it does not, but...

I just added GZipStream support for my app today, so I can share some code here;

Serialization:

using (Stream s = File.Create(PathName))
{
    RijndaelManaged rm = new RijndaelManaged();
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    using (CryptoStream cs = new CryptoStream(s, rm.CreateEncryptor(), CryptoStreamMode.Write))
    {
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Compress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(gs, _instance);
        }
    }
}

Deserialization:

using (Stream s = File.OpenRead(PathName))
{
    RijndaelManaged rm = new RijndaelManaged();
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    using (CryptoStream cs = new CryptoStream(s, rm.CreateDecryptor(), CryptoStreamMode.Read))
    {
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Decompress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            _instance = (Storage)bf.Deserialize(gs);
        }
    }
}

NOTE: if you use CryptoStream, it is kinda important that you chain (un)zipping and (de)crypting right this way, because you'll want to lose your entropy BEFORE encryption creates noise from your data.