Binary Formatter, Set Position to Deserialize Part

Posted 2019-08-28 16:35

Question:

I have a question about serializing and deserializing objects with BinaryFormatter. I am trying to deserialize one object from a FileStream that contains many objects, each serialized one after another. The objects are too big to hold in process memory all at once, which is why I don't pack them into a single container such as a List. Instead, I serialize them one at a time, as needed. That way memory use stays low, because I only process one object at a time rather than all of them. Here is a sketch of what I mean:

<FileStream>
----Object 1-----Size = 100 Mb------index = 0
----Object 2-----Size = 100 Mb------index = 1
----Object 3-----Size = 100 Mb------index = 2
----Object 4-----Size = 100 Mb------index = 3
----Object 5-----Size = 100 Mb------index = 4
----Object 6-----Size = 100 Mb------index = 5
</FileStream>

Serialization works fine, but I have a problem deserializing a single object. With a List, we can fetch an item by its index. For example, to get the item at index 5 we can write:

    List<object> list = new List<object>();
    list.Add("object1"); // index 0
    list.Add("object2"); // index 1
    list.Add("object3"); // index 2
    list.Add("object4"); // index 3
    list.Add("object5"); // index 4
    list.Add("object6"); // index 5
    object fifthIndex = list[5]; // get the item at index 5

Now the problem: how can I get the object at index 5, just like with a List, out of six objects serialized into a FileStream with BinaryFormatter? I know FileStream has a property named FileStream.Position, but it is not an index. It looks like an arbitrary number after I serialize or deserialize an object; perhaps it increases by an unpredictable amount each time.

Actually, I have managed to do this, but I doubt it is the best way. Here is the code I tried:

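    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    // Deserializes objects from the current stream position until the one at `index` is reached.
    object GetObjectStream(FileStream fs, int index)
    {
        if (fs == null) return null;

        BinaryFormatter binaryformatter = new BinaryFormatter();
        int count = 0;
        while (true)
        {
            try
            {
                // Every object before `index` is fully deserialized, then discarded.
                object objectdeserialized = binaryformatter.Deserialize(fs);
                if (count == index) return objectdeserialized;
                count++;
            }
            catch
            {
                // Deserialize failed (for example, at the end of the stream): no object at that index.
                return null;
            }
        }
    }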

This code loops over every serialized object and deserializes each one in turn. I doubt this is the best way, because deserializing a 100 MB object may take a long time, and I don't even know whether the objects deserialized before the requested index are disposed or not. What I want is a way to jump straight to the requested object, something like a "serialization leap".

Your answer would be very helpful and useful to me if we can solve this problem.

Thanks in advance.

Answer 1:

Each object will most likely take a different amount of space to serialize - data packs differently, especially for things like strings and arrays. Basically, to do this efficiently (i.e. without reading every object in full each time), you would want to either:

  • prefix each object with the amount of data it takes, by serializing it to a MemoryStream, storing the .Length (any way that is convenient to you; a 4-byte little-endian chunk would suffice), and then copying the data you wrote to the MemoryStream to the output; then you can skip to the n'th item by repeating n times: read 4 bytes as an int, then skip that many bytes
  • in a separate index, store the .Position of the base stream just before you serialize each object; then to read the n'th object, you use the index to find the position you need, and seek to there
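
As a rough illustration of the first option, here is a minimal sketch of length-prefixed records. The class and method names (FramedRecords, Write, ReadHere, ReadAt) are hypothetical, and the payload is assumed to be a byte array you already have, for example the contents of a MemoryStream that a single object was serialized into. Write also returns the start position of each record, so the same values could be kept in a separate index as in the second option.

```csharp
using System;
using System.IO;

static class FramedRecords
{
    // Appends one record: a 4-byte little-endian length, then the payload.
    // Returns the position where the record starts (usable as an index entry).
    public static long Write(Stream output, byte[] payload)
    {
        long start = output.Position;
        byte[] prefix = BitConverter.GetBytes(payload.Length);
        if (!BitConverter.IsLittleEndian) Array.Reverse(prefix);
        output.Write(prefix, 0, prefix.Length);
        output.Write(payload, 0, payload.Length);
        return start;
    }

    // Reads the record that starts at the current stream position.
    public static byte[] ReadHere(Stream input)
    {
        byte[] payload = new byte[ReadLength(input)];
        Fill(input, payload);
        return payload;
    }

    // Returns the record at a zero-based index. Earlier records are skipped
    // by seeking past their payloads, so only their 4-byte prefixes are read.
    public static byte[] ReadAt(Stream input, int index)
    {
        input.Position = 0;
        for (int i = 0; i < index; i++)
        {
            int length = ReadLength(input);
            input.Seek(length, SeekOrigin.Current);
        }
        return ReadHere(input);
    }

    static int ReadLength(Stream input)
    {
        byte[] prefix = new byte[4];
        Fill(input, prefix);
        if (!BitConverter.IsLittleEndian) Array.Reverse(prefix);
        return BitConverter.ToInt32(prefix, 0);
    }

    // Stream.Read may return fewer bytes than requested, so loop until full.
    static void Fill(Stream input, byte[] buffer)
    {
        int read = 0;
        while (read < buffer.Length)
        {
            int n = input.Read(buffer, read, buffer.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
    }
}
```

With this layout, FramedRecords.ReadAt(fs, 5) touches only five 4-byte prefixes before reading the sixth payload, which can then be wrapped in a MemoryStream and handed to the deserializer. If the start positions returned by Write are saved somewhere, setting fs.Position to the saved value and calling ReadHere avoids even that scan. Note that a 4-byte signed length caps each record at about 2 GB.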

Actually, you were quite lucky here: BinaryFormatter isn't actually documented as being safe to append, but as it happens it does kinda work out ok if you do that - but this isn't true for all serialization formats.

Personally, though, I'd question whether there is simply a different design that could be used here.