Deserialize a byte array to a struct

2019-02-13 20:52发布

I get a transmission over the network that's an array of chars/bytes. It contains a header and some data. I'd like to map the header onto a struct. Here's an example:

#pragma pack(1)

struct Header
{
    unsigned short bodyLength;
    int msgID;
    unsigned short someOtherValue;
    unsigned short protocolVersion;
};

int main()
{
    boost::array<char, 128> msgBuffer;
    Header header;

    for(int x = 0; x < sizeof(Header); x++)
        msgBuffer[x] = 0x01; // assign some values

    memcpy(&header, msgBuffer.data(), sizeof(Header));

    system("PAUSE");    

    return 0;
}

Will this always work assuming the structure never contains any variable length fields? Is there a platform independent / idiomatic way of doing this?

Note:

I have seen quite a few libraries on the internet that let you serialize/deserialize, but I get the impression that they can only deserialize something if it has ben previously serialized with the same library. Well, I have no control over the format of the transmission. I'm definitely going to get a byte/char array where all the values just follow upon each other.

6条回答
我只想做你的唯一
2楼-- · 2019-02-13 21:27

Just plain copying is very likely to break, at least if the data can come from a different architecture (or even just compiler) than what you are on. This is for reasons of:

That second link is GCC-specific, but this applies to all compilers.

I recommend reading the fields byte-by-byte, and assembling larger field (ints, etc) from those bytes. This gives you control of endianness and padding.

查看更多
可以哭但决不认输i
3楼-- · 2019-02-13 21:31

Some processors require that certain types are properly aligned. They will not accept the specified packing and generate a hardware trap.

And even on common x86 packed structures can cause the code to run more slowly.

Also you will have to take care when working with different endianness platforms.

By the way, if you want a simple and platform-independent communication mechanism with bindings to many programming languages, then have a look at YAMI.

查看更多
相关推荐>>
4楼-- · 2019-02-13 21:41

The #pragma pack(1) directive should work on most compilers but you can check by working out how big your data structure should be (10 in your case if my maths is correct) and using printf("%d", sizeof(Header)); to check that the packing is being done.

As others have said you still need to be wary of Endianness if you're going between architectures.

查看更多
你好瞎i
5楼-- · 2019-02-13 21:42

I strongly disagree with the idea of reading byte by byte. If you take care of the structure packing in the struct declaration, you can copy into the struct without a problem. For the endiannes problem again reading byte by byte solves the problem but does not give you a generic solution. That method is very lame. I have done something like this before for a similar job and it worked allright without a glitch.

Think about this. I have a structure, I also have a corresponding definition of that structure. You may construct this by hand but I have had written a parser for this and used it for other things as well.

For example, the definition of the structure you gave above is "s i s s". ( s = short , i = int ) Then I give the struct address , this definition and structure packing option of this struct to a special function that deals with the endiannes thing and voila it is done.

SwitchEndianToBig(&header, "s i s s", 4); // 4 = structure packing option

查看更多
小情绪 Triste *
6楼-- · 2019-02-13 21:42

Tell me if I'm wrong, but AFAIK, doing it that way will guarantee you that the data is correct - assuming the types have the same size on your different platforms :

#include <array>
#include <algorithm>

//#pragma pack(1) // not needed

struct Header
{
    unsigned short bodyLength;
    int msgID;
    unsigned short someOtherValue;
    unsigned short protocolVersion;
    float testFloat;

    Header() : bodyLength(42), msgID(34), someOtherValue(66), protocolVersion(69), testFloat( 3.14f ) {}
};

int main()
{
    std::tr1::array<char, 128> msgBuffer;
    Header header;

    const char* rawData = reinterpret_cast< const char* >( &header );

    std::copy( rawData, rawData + sizeof(Header), msgBuffer.data()); // assuming msgBuffer is always big enough

    system("PAUSE");    

    return 0;
}

If the types are different on your targeted plateforms, you have to uses aliases (typedef) for each type to be sure the size of each used type is the same.

查看更多
我只想做你的唯一
7楼-- · 2019-02-13 21:49

I know who I'm communicating with, so I don't really have to worry about endianness. But I like to stay away from compiler specific commands anyway.

So how about this:

const int kHeaderSizeInBytes = 6;

struct Header
{
    unsigned short bodyLength;
    unsigned short msgID;
    unsigned short protocolVersion; 

    unsigned short convertUnsignedShort(char inputArray[sizeof(unsigned short)])
        {return (((unsigned char) (inputArray[0])) << 8) + (unsigned char)(inputArray[1]);}

    void operator<<(char inputArray[kHeaderSizeInBytes])
    {
        bodyLength = convertUnsignedShort(inputArray);
        msgID = convertUnsignedShort(inputArray + sizeof(bodyLength));
        protocolVersion = convertUnsignedShort(inputArray + sizeof(bodyLength) + sizeof(msgID));
    }
};

int main()
{
    boost::array<char, 128> msgBuffer;
    Header header;

    for(int x = 0; x < kHeaderSizeInBytes; x++)
        msgBuffer[x] = x;

    header << msgBuffer.data();

    system("PAUSE");    

    return 0;
}

Gets rid of the pragma, but it isn't as general purpose as I'd like. Every time you add a field to the header you have to modify the << function. Can you iterate over struct fields somehow, get the type of the field and call the corresponding function?

查看更多
登录 后发表回答