Serializing / Marshalling simple objects in C# to

Posted 2019-08-12 15:22

Question:

I am writing a C# application that needs to communicate with an unmanaged C++ application over the network using pre-defined messages. Every message starts with a message id and a length:

Byte 0  MessageId
Byte 1  Length

After that, the layout differs for each message. For example, a message to set the time looks like this:

Byte 0      MessageId
Byte 1      Length
Byte 2..5   Time
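
For illustration, the set-time layout above can be written out by hand. This is only a minimal sketch: the name `WireFormat`, the id value 1, treating Length as the total message size, and the little-endian byte order of the time value are all assumptions, since the protocol details are not given here.

```csharp
public static class WireFormat
{
    // Hypothetical id value; the real protocol defines its own ids.
    public const byte SetTimeId = 1;

    // Builds the 6-byte set-time message: id, length, then 4 time bytes.
    public static byte[] BuildSetTime(uint time)
    {
        var buffer = new byte[6];
        buffer[0] = SetTimeId;                    // Byte 0: MessageId
        buffer[1] = (byte)buffer.Length;          // Byte 1: Length (assumed to be the total size)
        // Bytes 2..5: time, least significant byte first (assumed byte order)
        buffer[2] = (byte)(time & 0xFF);
        buffer[3] = (byte)((time >> 8) & 0xFF);
        buffer[4] = (byte)((time >> 16) & 0xFF);
        buffer[5] = (byte)((time >> 24) & 0xFF);
        return buffer;
    }
}
```

Writing the bytes explicitly like this makes the byte order independent of the machine the code runs on, at the cost of one hand-written method per message type.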

Starting out, I figured I would just create a base class which all other messages would inherit from, which would have a method to "serialize" it:

public class BaseMessage
{
    public virtual byte MessageId { get; }
    public virtual byte Length { get; }

    public byte[] GetAsMessage()
    {
        ...
    }
}

public class SetTimeMessage : BaseMessage
{
    public override byte MessageId => 1;
    public override byte Length { get; }
    public byte[] Time {get; private set; }

    public SetTimeMessage(byte[] time)
    {
        ...
    }
}

In the C++ code, if I create a SetTimeMessage and want to send it over the network, a method like GetAsMessage which is defined on the base class is called, which will simply copy the contents of the object into a buffer. This works for all derived types of BaseMessage.

How can I do something similar in C#? I tried using the [Serializable] attribute and the BinaryFormatter, but it returned a huge byte array, not just the 6 bytes that the values actually contain.

I also looked into Marshalling, but it only seems to work with structs? If so, it seems like a lot of work needs to be done for each message type, since structs don't support inheritance (there are many messages I need to implement).

I will also receive messages back in the same format, which need to be deserialized back to an object. Anybody know of a neat way of achieving what I'm after?
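
On the receiving side, the two-byte header is enough to split a byte stream into individual messages before any deserialization happens. The following is a rough sketch of that step, not part of the original code: `MessageFraming` and `ReadExactly` are hypothetical names, and it assumes that Length counts the whole message including the two header bytes.

```csharp
using System;
using System.IO;

public static class MessageFraming
{
    // Reads one complete message from the stream.
    // Assumption: Length (byte 1) is the total size, header included.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, 2);
        int total = header[1];
        if (total < 2)
            throw new InvalidDataException("Length is smaller than the header.");

        byte[] message = new byte[total];
        header.CopyTo(message, 0);
        ReadExactly(stream, total - 2).CopyTo(message, 2);
        return message;
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }
}
```

With a NetworkStream, each call to `MessageFraming.ReadMessage(stream)` would return one message's bytes, which can then be handed to whatever turns them back into an object.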

Edit

I've done some experimenting with structs and marshalling, after input from Spo1ler.

Doing it this way, it's simple to serialize any message to a byte array:

public interface IMsg
{
    byte MessageId { get; }
    byte Length { get; }
}

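[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct SetTime : IMsg
{
    // Fields are marshalled in declaration order: id, length, then 4 time bytes.
    private readonly byte _messageId;
    private readonly byte _length;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public byte[] Time;

    public byte MessageId => _messageId;
    public byte Length => _length;

    public SetTime(byte[] time)
    {
        _messageId = (byte) MessageType.SetTime;
        // Assumes Length counts the whole message (6 bytes).
        _length = (byte) Marshal.SizeOf(typeof(SetTime));
        Time = time;
    }
}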

I then have this static class to serialize the messages:

public static class MessageSerializer
{
    public static byte[] Serialize(IMsg msg)
    {
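        int size = Marshal.SizeOf(msg);
        byte[] serialized = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);

        try
        {
            // false: the freshly allocated block holds no old structure to release.
            Marshal.StructureToPtr(msg, ptr, false);
            Marshal.Copy(ptr, serialized, 0, size);
        }
        finally
        {
            // Free the unmanaged block even if marshalling throws.
            Marshal.FreeHGlobal(ptr);
        }

        return serialized;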
    }
}

At first glance this seems to work the way I want. Are there any problems that can show up when doing it this way?

However, deserializing the messages will be a pain to write, even though the first byte in the message (MessageId) should tell me which type of struct the byte array translates to.
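
One way to keep that deserialization manageable is a lookup from message id to struct type, so that adding a message means adding one table entry. This is a sketch under assumptions rather than a finished design: `SetTimeMsg`, `MessageDeserializer` and the id value 1 are made up for the example, and the `uint` field is read in the byte order of the machine running the code.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical 6-byte message: id, length, 4 time bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SetTimeMsg
{
    public byte MessageId;
    public byte Length;
    public uint Time;
}

public static class MessageDeserializer
{
    // Maps the first byte of a message to the struct type it describes.
    // The id value 1 is an assumption for this sketch.
    private static readonly Dictionary<byte, Type> TypesById = new Dictionary<byte, Type>
    {
        { 1, typeof(SetTimeMsg) },
    };

    public static object Deserialize(byte[] bytes)
    {
        if (bytes == null || bytes.Length == 0)
            throw new ArgumentException("Empty message.", nameof(bytes));
        if (!TypesById.TryGetValue(bytes[0], out Type type))
            throw new NotSupportedException("Unknown message id " + bytes[0]);

        int size = Marshal.SizeOf(type);
        if (bytes.Length < size)
            throw new ArgumentException("Message is shorter than its struct.", nameof(bytes));

        // Pin the managed array so its address stays valid while it is read.
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStructure(handle.AddrOfPinnedObject(), type);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

The caller gets back a boxed struct and can branch on its type, for example with `if (result is SetTimeMsg setTime) { ... }`.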

Answer 1:

BinaryFormatter saves the object in a format that .NET can read back later: it records which assembly the type came from, among other metadata. That makes it unsuitable in your case, where the message has a known byte structure.

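To have a type that you can successfully serialize like Plain Old Data, you need to use StructLayoutAttribute on it, like this:

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public class Message
{
    public byte MessageId;
    public byte Length;
    // Embedded inline as 4 bytes (the set-time payload). Without MarshalAs
    // the array would not be marshalled inline as part of the message.
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public byte[] Data;
}

A StructLayoutAttribute with any value other than LayoutKind.Auto tells the runtime not to rearrange the fields, but to use the layout you provide. Here, Sequential keeps the fields in declaration order and Pack = 1 removes any padding between them. LayoutKind.Explicit, with a [FieldOffset] on every field, is the alternative when you want to state each offset yourself, but a managed array field must sit at a pointer-aligned offset, which offset 2 is not, so the inline array is declared with MarshalAs instead.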

Also, you should probably consider changing the message format, because fields whose offsets are not multiples of the machine word size can be quite expensive in terms of performance: computers work much better with data that is aligned to the machine word size than with an arbitrary number of bytes.

It's a bad approach to try to build a hierarchy using base classes in this manner. If you really want a hierarchy in your message types, use interfaces instead, but you still need to lay out your data properly.
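
The trade-off between alignment and a compact wire layout shows up in how the marshaller sizes a struct. In this small sketch (the type names are invented for the example), the same two fields occupy 8 bytes with default packing, because the int is aligned to a 4-byte boundary, and 5 bytes with Pack = 1.

```csharp
using System.Runtime.InteropServices;

// Default packing: 3 padding bytes follow Id so that Value is 4-byte aligned.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedHeader
{
    public byte Id;
    public int Value;
}

// Pack = 1: no padding, the fields sit back to back as they would on the wire.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedHeader
{
    public byte Id;
    public int Value;
}

public static class LayoutSizes
{
    public static int PaddedSize => Marshal.SizeOf(typeof(PaddedHeader)); // 8
    public static int PackedSize => Marshal.SizeOf(typeof(PackedHeader)); // 5
}
```

When the byte layout is dictated by an existing protocol, the packed form is the one that matches the wire; the aligned form only matters if the format is still open to change.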

Then you can use marshalling to serialize your data to a byte array, something like this:

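byte[] GetBytes(Message message) {
    int size = Marshal.SizeOf(message);
    byte[] arr = new byte[size];
    IntPtr ptr = Marshal.AllocHGlobal(size);

    try {
        // false: the freshly allocated block holds no old structure to release.
        Marshal.StructureToPtr(message, ptr, false);
        Marshal.Copy(ptr, arr, 0, size);
    }
    finally {
        // Free the unmanaged block even if marshalling throws.
        Marshal.FreeHGlobal(ptr);
    }

    return arr;
}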

And if you want to deserialize, you need a method like this:

Message FromBytes(byte[] bytes) {
    int size = Marshal.SizeOf(typeof(Message));
    IntPtr ptr = Marshal.AllocHGlobal(size);

    try {
        // Copy the received bytes into unmanaged memory, then rebuild the object.
        Marshal.Copy(bytes, 0, ptr, size);
        return (Message)Marshal.PtrToStructure(ptr, typeof(Message));
    }
    finally {
        // Free the unmanaged block even if marshalling throws.
        Marshal.FreeHGlobal(ptr);
    }
}
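
The two methods can also be written once for any message struct instead of once per type. The generic round trip below is a sketch of that idea, assuming a hypothetical 6-byte `TimeMessage` and helper class `StructBytes` that do not appear in the original.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 6-byte message: id, length, 4 time bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct TimeMessage
{
    public byte MessageId;
    public byte Length;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public byte[] Time;
}

public static class StructBytes
{
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        byte[] bytes = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // false: the freshly allocated block holds nothing that needs releasing.
            Marshal.StructureToPtr(value, ptr, false);
            Marshal.Copy(ptr, bytes, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return bytes;
    }

    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        if (bytes.Length < size)
            throw new ArgumentException("Not enough bytes for " + typeof(T).Name, nameof(bytes));
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, ptr, size);
            return Marshal.PtrToStructure<T>(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

A call such as `StructBytes.FromBytes<TimeMessage>(StructBytes.ToBytes(message))` should give back a message with the same field values, which makes a convenient sanity check for each new message struct.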

If you need to decide at runtime which class to use according to the message you've received, each of these methods can become quite big and very specific to your hierarchy, but you can certainly do it by looking at the first bytes and then deciding how to deserialize the rest accordingly.

But from my point of view, it's a bad approach to try to build a hierarchy based on some common parts of the message. I would rather agree on sequences of messages of known types, so that you can tell which message you will get next, or something along those lines, but it's up to you. The information I have provided here should be enough to serialize a message made up of integral types to an array of bytes, which you can then send over the network.