Are packed structs portable?

Posted 2019-03-09 00:18

闹够了就滚

Question:

I have some code on a Cortex-M4 microcontroller and would like to communicate with a PC using a binary protocol. Currently, I'm using packed structs declared with the GCC-specific packed attribute.

Here is a rough outline:

struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

struct TelemetryPacket {
    Sensor1Telemetry tele1;
    Sensor2Telemetry tele2;
    // etc...
} __attribute__((__packed__));

My question is:

Assuming that I use the exact same definition of the TelemetryPacket struct on the MCU and in the client app, will the above code be portable across multiple platforms? (I'm interested in x86 and x86_64, and need it to run on Windows, Linux and OS X.)
Do other compilers support packed structs with the same memory layout? With what syntax?

EDIT:

Yes, I know packed structs are non-standard, but they seem useful enough to consider using them.
I'm interested in both C and C++, although I don't think GCC would handle them differently.
These structs are not inherited and don't inherit anything.
These structs only contain fixed-size integer fields, and other similar packed structs. (I've been burned by floats before...)

Answer 1:

You should never use structs across compile domains: to overlay memory (hardware registers), to pick apart items read from a file, or to pass data between processors or between different software on the same processor (for example, between an app and a kernel driver). You are asking for trouble, because the compiler has a fair amount of freedom in choosing alignment, and the user can make matters worse by adding modifiers.

No, there is no reason to assume you can do this safely across platforms, even if you use the same GCC version against different targets (the compiler builds differ, as do the targets).

To reduce your odds of failure, put the largest items first (64-bit, then 32-bit, then 16-bit, then lastly any 8-bit items). Ideally align on 32 bits at minimum, perhaps 64, which one would hope ARM and x86 do, but that can always change, and the default can be modified by whoever builds the compiler from source.

Now, if this is a job-security thing, sure, go ahead: you can do regular maintenance on this code. You will likely need a definition of each structure for each target (one copy of the structure definition for ARM and another for x86), if not immediately then eventually. And then every release, or every few releases, you get called in to work on the code... Nice little maintenance time bombs that go off...

If you want to communicate safely between compile domains or between processors, of the same or different architectures, use an array of some size: a stream of bytes, a stream of halfwords, or a stream of words. This significantly reduces your risk of failure and of maintenance down the road. Do not use structures to pick apart those items; that just brings the risk of failure back.
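As a rough illustration of the byte-stream approach, here is a minimal sketch that writes and reads the question's Sensor1Telemetry fields one byte at a time using shifts. The wire layout (little-endian, offsets 0, 2 and 6) and the names kSensor1WireSize, put_u16, put_u32, get_u16, get_u32, encode and decode are assumptions made for this example, not part of the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed wire format (little-endian, no padding):
//   offset 0: int16_t  temperature
//   offset 2: uint32_t timestamp
//   offset 6: uint16_t voltageMv
constexpr std::size_t kSensor1WireSize = 8;

// Ordinary struct: the compiler may pad it however it likes,
// because it is never copied to or from the wire as a block.
struct Sensor1Telemetry {
    std::int16_t  temperature;
    std::uint32_t timestamp;
    std::uint16_t voltageMv;
};

inline void put_u16(std::uint8_t* p, std::uint16_t v) {
    p[0] = static_cast<std::uint8_t>(v & 0xFF);
    p[1] = static_cast<std::uint8_t>(v >> 8);
}

inline void put_u32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        p[i] = static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF);
    }
}

inline std::uint16_t get_u16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t get_u32(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Writes exactly kSensor1WireSize bytes to out.
inline void encode(const Sensor1Telemetry& t, std::uint8_t* out) {
    assert(out != nullptr);
    put_u16(out + 0, static_cast<std::uint16_t>(t.temperature));
    put_u32(out + 2, t.timestamp);
    put_u16(out + 6, t.voltageMv);
}

// Reads exactly kSensor1WireSize bytes from in.
inline Sensor1Telemetry decode(const std::uint8_t* in) {
    assert(in != nullptr);
    Sensor1Telemetry t;
    // Two's-complement conversion; implementation-defined before C++20,
    // but it behaves as expected on the platforms discussed here.
    t.temperature = static_cast<std::int16_t>(get_u16(in + 0));
    t.timestamp   = get_u32(in + 2);
    t.voltageMv   = get_u16(in + 6);
    return t;
}
```

Because every byte is placed explicitly, the result does not depend on the host's endianness, alignment rules or struct padding.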

The reason folks seem to think this is okay is that they use the same compiler or compiler family against the same target or target family (or compilers that copy other compilers' choices). As you come to understand the rules of the language and where the implementation-defined areas are, you will eventually run across a difference. Sometimes it takes decades in your career, sometimes it takes weeks... It's the "works on my machine" problem.

Answer 2:

Considering the platforms mentioned, yes, packed structs are completely fine to use. x86 and x86_64 have always supported unaligned access, and contrary to common belief, unaligned access on these platforms has had (almost) the same speed as aligned access for a long time (it is not true that unaligned access is much slower). The only drawback is that the access may not be atomic, but I don't think that matters in this case. There is also agreement between compilers: packed structs will use the same layout.

GCC and clang support packed structs with the syntax you mentioned. MSVC has #pragma pack, which can be used like this:

#pragma pack(push, 1)
struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
};
#pragma pack(pop)
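GCC and clang also accept the #pragma pack(push, 1) / #pragma pack(pop) form, so one definition can serve all three compilers. The sketch below adds compile-time checks so that a layout mismatch stops the build instead of corrupting packets. The expected size and offsets (8, 2 and 6) are assumptions based on the three fields shown, with the elided fields left out.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

#pragma pack(push, 1)
struct Sensor1Telemetry {
    std::int16_t  temperature;
    std::uint32_t timestamp;
    std::uint16_t voltageMv;
};
#pragma pack(pop)

// If any compiler lays the struct out differently, the build fails here.
static_assert(sizeof(Sensor1Telemetry) == 8,
              "Sensor1Telemetry must be 8 bytes on the wire");
static_assert(offsetof(Sensor1Telemetry, temperature) == 0,
              "temperature must be at offset 0");
static_assert(offsetof(Sensor1Telemetry, timestamp) == 2,
              "timestamp must be at offset 2");
static_assert(offsetof(Sensor1Telemetry, voltageMv) == 6,
              "voltageMv must be at offset 6");
```

Compiling the same header on the MCU toolchain and on each PC toolchain then acts as a cheap layout regression test.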

Two issues can arise:

Endianness must be the same across platforms (your MCU must be using little-endian).
If you assign the address of a packed struct member to a pointer, and you're on an architecture that doesn't support unaligned access (or you use instructions that have alignment requirements, like movaps or ldrd), then you may get a crash when using that pointer (older GCC versions don't warn about this, but clang does; GCC 9 and later warn via -Waddress-of-packed-member).
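Both issues can be guarded against in code. The following is a sketch only: read_timestamp, read_u32_unaligned and host_is_little_endian are hypothetical helper names, and the struct contains just the three fields shown in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct Sensor1Telemetry {
    std::int16_t  temperature;
    std::uint32_t timestamp;   // at offset 2, so not 4-byte aligned
    std::uint16_t voltageMv;
};
#pragma pack(pop)

// Risky pattern (left as a comment on purpose):
//   std::uint32_t* p = &t.timestamp;  // the pointer type promises 4-byte alignment
//   use(*p);                          // may fault on strict-alignment targets

// Safer: read the member through the struct, so the compiler knows it is
// packed and can emit an access that tolerates misalignment.
inline std::uint32_t read_timestamp(const Sensor1Telemetry& t) {
    return t.timestamp;
}

// Safer when only a raw address is available: copy the bytes out.
inline std::uint32_t read_u32_unaligned(const void* p) {
    assert(p != nullptr);
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Runtime check for the endianness requirement: true on little-endian hosts.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    std::uint8_t first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

A receiver could call host_is_little_endian() once at start-up and refuse to run, or switch to byte swapping, if it returns false.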

Here's the doc from GCC:

The packed attribute specifies that a variable or structure field should have the smallest possible alignment—one byte for a variable

So GCC guarantees that no padding will be used.

MSVC:

To pack a class is to place its members directly after each other in memory

So MSVC guarantees that no padding will be used.

The only "dangerous" area I've found is the use of bitfields, where the layout may differ between GCC and MSVC. However, GCC has an option that makes them compatible: -mms-bitfields

Tip: even if this solution works now, and it is highly unlikely that it will stop working, I recommend keeping your code's dependency on it low.

Note: I've considered only GCC, clang and MSVC in this answer. There may be compilers for which these things are not true.

Answer 3:

If

endianness is not an issue
both compilers handle packing correctly
the type definitions on both C implementations are accurate (Standard compliant).

then yes, "packed structures" are portable.

For my taste that is too many "if"s; do not do this. It's not worth the hassle that will arise.

Answer 4:

You could do that, or use a more reliable alternative.

For the hard core among the serialisation fanatics out there, there's CapnProto. This gives you a native structure to deal with, and undertakes to ensure that when it's transferred across a network and lightly worked on, it will still make sense at the other end. To call it serialisation is nearly inaccurate; it aims to do as little as possible to the in-memory representation of a structure. It might be amenable to porting to an M4.

There's Google Protocol Buffers, which is binary. It's more bloated, but pretty good. There's the accompanying nanopb (more suited to microcontrollers), but it doesn't do the whole of GPB (I don't think it does oneof). Many people use it successfully, though.

Some of the C asn1 runtimes are small enough for use on microcontrollers. I know this one fits on M0.

Answer 5:

If you want something maximally portable, you can declare a buffer of uint8_t[TELEM1_SIZE] and memcpy() to and from offsets within it, performing endianness conversions such as htons() and htonl() (or little-endian equivalents such as the ones in glib). You could wrap this in a class with getter/setter methods in C++, or a struct with getter-setter functions in C.
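A minimal sketch of such a wrapper class follows. It assumes a little-endian wire format with the three fields at offsets 0, 2 and 6, and TELEM1_SIZE = 8. The conversion helpers (le16, le32 and friends) are hand-written stand-ins invented for this example, used in place of htons()/htonl() so the code does not depend on platform socket headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t TELEM1_SIZE = 8;  // assumed: 2 + 4 + 2 bytes

inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    std::uint8_t first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

inline std::uint16_t swap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

inline std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Host order <-> little-endian wire order. A byte swap is its own inverse,
// so the same function serves both directions.
inline std::uint16_t le16(std::uint16_t v) { return host_is_little_endian() ? v : swap16(v); }
inline std::uint32_t le32(std::uint32_t v) { return host_is_little_endian() ? v : swap32(v); }

class Sensor1TelemetryBuffer {
public:
    std::int16_t  temperature() const { return static_cast<std::int16_t>(load16(0)); }
    std::uint32_t timestamp()   const { return load32(2); }
    std::uint16_t voltage_mv()  const { return load16(6); }

    void set_temperature(std::int16_t v) { store16(0, static_cast<std::uint16_t>(v)); }
    void set_timestamp(std::uint32_t v)  { store32(2, v); }
    void set_voltage_mv(std::uint16_t v) { store16(6, v); }

    const std::uint8_t* data() const { return bytes_; }  // bytes to transmit

private:
    std::uint16_t load16(std::size_t off) const {
        assert(off + sizeof(std::uint16_t) <= TELEM1_SIZE);
        std::uint16_t v;
        std::memcpy(&v, bytes_ + off, sizeof v);
        return le16(v);
    }
    std::uint32_t load32(std::size_t off) const {
        assert(off + sizeof(std::uint32_t) <= TELEM1_SIZE);
        std::uint32_t v;
        std::memcpy(&v, bytes_ + off, sizeof v);
        return le32(v);
    }
    void store16(std::size_t off, std::uint16_t v) {
        assert(off + sizeof v <= TELEM1_SIZE);
        v = le16(v);
        std::memcpy(bytes_ + off, &v, sizeof v);
    }
    void store32(std::size_t off, std::uint32_t v) {
        assert(off + sizeof v <= TELEM1_SIZE);
        v = le32(v);
        std::memcpy(bytes_ + off, &v, sizeof v);
    }

    std::uint8_t bytes_[TELEM1_SIZE] = {};
};
```

Callers only ever see typed getters and setters, while the bytes returned by data() have the same layout on every host.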

Answer 6:

It strongly depends on what the struct is. Bear in mind that in C++ a struct is a class with default public visibility.

So you can inherit from it and even add virtual functions, and that could break things for you.

If it is a pure data class (in C++ terms, a standard-layout class), this should work in combination with packed.

Also bear in mind that if you start doing this, you might run into problems with your compiler's strict aliasing rules, because you will have to look at the byte representation of your memory (-fno-strict-aliasing is your friend).
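One way to sidestep the aliasing concern without changing compiler flags is to check the type properties at compile time and move data with memcpy rather than pointer casts. This is a sketch under that assumption; from_bytes and to_bytes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

#pragma pack(push, 1)
struct Sensor1Telemetry {
    std::int16_t  temperature;
    std::uint32_t timestamp;
    std::uint16_t voltageMv;
};
#pragma pack(pop)

// Reject inheritance with virtuals, non-trivial members and similar surprises.
static_assert(std::is_standard_layout<Sensor1Telemetry>::value,
              "wire structs must be standard-layout");
static_assert(std::is_trivially_copyable<Sensor1Telemetry>::value,
              "wire structs must be trivially copyable");

// Avoid: reinterpret_cast<const Sensor1Telemetry*>(bytes). Reading a byte
// buffer through a struct pointer is what the aliasing rules do not allow.
// Copying the bytes into a real object is well defined.
inline Sensor1Telemetry from_bytes(const std::uint8_t* bytes, std::size_t len) {
    assert(bytes != nullptr && len >= sizeof(Sensor1Telemetry));
    Sensor1Telemetry t;
    std::memcpy(&t, bytes, sizeof t);
    return t;
}

// out must have room for sizeof(Sensor1Telemetry) bytes.
inline void to_bytes(const Sensor1Telemetry& t, std::uint8_t* out) {
    assert(out != nullptr);
    std::memcpy(out, &t, sizeof t);
}
```

Optimising compilers typically turn these fixed-size memcpy calls into plain loads and stores, so the safety costs little.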

Note

That being said, I would strongly advise against using this for serialization. If you use tools built for it (e.g. protobuf, flatbuffers, msgpack, or others) you get a ton of features:

language independence
rpc (remote procedure call)
data specification languages
schemas/validation
versioning

Answer 7:

Here is pseudocode for an approach that may fit your needs, to ensure the code is used with the proper target OS and platform.

If you are using C you will not be able to use classes, templates and a few other things, but you can use preprocessor directives to create the version of your struct(s) you need based on the OS, the architecture (CPU, GPU, or hardware controller manufacturer: Intel, AMD, IBM, Apple, etc.), the platform (x86 or x64), and finally the endianness of the byte layout. Otherwise the focus here is on C++ and the use of templates.

Take your struct(s) for example:

struct Sensor1Telemetry {
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

struct TelemetryPacket {
    Sensor1Telemetry tele1;
    Sensor2Telemetry tele2;
    // etc...
} __attribute__((__packed__));

You could template these structs as such:

enum OS_Type {
    // Flag bits - Windows, first 4 bits
    WINDOWS    = 0x01, //  1
    WINDOWS_7  = 0x02, //  2
    WINDOWS_8  = 0x04, //  4
    WINDOWS_10 = 0x08, //  8

    // Flag bits - Linux, second 4 bits
    LINUX      = 0x10, // 16
    LINUX_vA   = 0x20, // 32
    LINUX_vB   = 0x40, // 64
    LINUX_vC   = 0x80, // 128

    // Flag bits - third 4 bits
    OS         = 0x100, // 256
    OS_vA      = 0x200, // 512
    OS_vB      = 0x400, // 1024
    OS_vC      = 0x800  // 2048

    //....
};

enum ArchitectureType {
    ANDROID  = 0x01,
    AMD      = 0x02,
    ASUS     = 0x04,
    NVIDIA   = 0x08,
    IBM      = 0x10,
    INTEL    = 0x20,
    MOTOROLA = 0x40,
    //...
};

enum PlatformType {
    X86 = 0x01,
    X64 = 0x02,
    // Legacy - Deprecated Models
    X32 = 0x04,
    X16 = 0x08,
    // ... etc.
};

enum EndianType {
    LITTLE = 0x01,
    BIG    = 0x02,
    MIXED  = 0x04,
    // ....
};

// Struct to hold the target machine's properties and attributes: add this to your existing struct.

struct TargetMachine {
    unsigned int os_;
    unsigned int architecture_;
    unsigned char platform_;
    unsigned char endian_;

    TargetMachine() :
      os_(0), architecture_(0),
      platform_(0), endian_(0) {
    }

    // Parameter names must differ from the member names they initialize.
    TargetMachine( unsigned int os, unsigned int architecture,
                   unsigned char platform, unsigned char endian ) :
      os_(os), architecture_(architecture),
      platform_(platform), endian_(endian) {
    }
};

template<unsigned int OS, unsigned int Architecture, unsigned char Platform, unsigned char Endian>
struct Sensor1Telemetry {       
    int16_t temperature;
    uint32_t timestamp;
    uint16_t voltageMv;
    // etc...
} __attribute__((__packed__));

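template<unsigned int OS, unsigned int Architecture, unsigned char Platform, unsigned char Endian>
struct TelemetryPacket {
    TargetMachine targetMachine { OS, Architecture, Platform, Endian };
    // The members are now templates too, so they need the same arguments
    // (Sensor2Telemetry is assumed to be templated like Sensor1Telemetry).
    Sensor1Telemetry<OS, Architecture, Platform, Endian> tele1;
    Sensor2Telemetry<OS, Architecture, Platform, Endian> tele2;
    // etc...
} __attribute__((__packed__));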

With these enum identifiers you could then use class template specialization to set up this class for its needs, depending on the above combinations. Here I would take all the common cases that seem to work fine with the default class declaration and definition and make that the main class's functionality. The special cases, such as a different byte order, specific OS versions doing something differently, or GCC versus MS compilers with __attribute__((__packed__)) versus #pragma pack(), then become the few specializations that need to be accounted for. You shouldn't need to write a specialization for every possible combination; that would be too daunting and time-consuming. You should only need to cover the few rare cases that can occur, to make sure you always have the proper code for the target.

What also makes the enums handy is that, because they are designed as bit flags, you can set several at a time when you pass them as a function argument. So if you want to create a function that takes this template struct as its first argument and the supported OSs as its second, you can pass in all the supported OSs as bit flags.
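To make the specialization idea concrete, here is a minimal sketch that specializes only on byte order. It reuses the EndianType enumerators from above; WireCodec, load32 and read_timestamp are hypothetical names, and the timestamp offset of 2 is an assumption taken from the Sensor1Telemetry field order.

```cpp
#include <cassert>
#include <cstdint>

enum EndianType { LITTLE = 0x01, BIG = 0x02 };

// Primary template is declared but not defined: an unsupported byte order
// (for example a mixed one) fails to compile instead of misbehaving at run time.
template <EndianType E>
struct WireCodec;

// Specialization for little-endian wire data.
template <>
struct WireCodec<LITTLE> {
    static std::uint32_t load32(const std::uint8_t* p) {
        return  static_cast<std::uint32_t>(p[0])
             | (static_cast<std::uint32_t>(p[1]) << 8)
             | (static_cast<std::uint32_t>(p[2]) << 16)
             | (static_cast<std::uint32_t>(p[3]) << 24);
    }
};

// Specialization for big-endian wire data.
template <>
struct WireCodec<BIG> {
    static std::uint32_t load32(const std::uint8_t* p) {
        return (static_cast<std::uint32_t>(p[0]) << 24)
             | (static_cast<std::uint32_t>(p[1]) << 16)
             | (static_cast<std::uint32_t>(p[2]) << 8)
             |  static_cast<std::uint32_t>(p[3]);
    }
};

// Generic code picks the right specialization from its template argument.
template <EndianType WireOrder>
std::uint32_t read_timestamp(const std::uint8_t* packet) {
    assert(packet != nullptr);
    return WireCodec<WireOrder>::load32(packet + 2);  // assumed offset of timestamp
}
```

For example, read_timestamp<LITTLE>(packet) decodes data from a little-endian sender, and only the specializations differ between byte orders.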

This may help to ensure that this set of packed structures is packed and/or aligned correctly for the target in question, and that it always behaves the same way, maintaining portability across different platforms.

You may have to write these specializations twice, inside preprocessor directives for the different supported compilers: if the current compiler is GCC it defines the struct and its specializations one way, Clang another, MSVC, Code Blocks, and so on. So there is a little overhead to get this set up initially, but it should go a long way towards ensuring that the struct is used properly for the specified scenario or combination of target-machine attributes.

Answer 8:

Speaking of alternatives, and considering your question "Tuple-like container for packed data" (on which I don't have enough reputation to comment), I suggest having a look at Alex Robenko's CommsChampion project:

COMMS is the C++(11) headers only, platform independent library, which makes the implementation of a communication protocol to be an easy and relatively quick process. It provides all the necessary types and classes to make the definition of the custom messages, as well as wrapping transport data fields, to be simple declarative statements of type and class definitions. These statements will specify WHAT needs to be implemented. The COMMS library internals handle the HOW part.

Since you're working on a Cortex-M4 microcontroller, you may also find it interesting that:

The COMMS library was specifically developed to be used in embedded systems including bare-metal ones. It doesn't use exceptions and/or RTTI. It also minimises usage of dynamic memory allocation and provides an ability to exclude it altogether if required, which may be needed when developing bare-metal embedded systems.

Alex provides an excellent free ebook titled Guide to Implementing Communication Protocols in C++ (for Embedded Systems) which describes the internals.

Answer 9:

Not always. When you send data to a processor with a different architecture, you need to consider endianness, the sizes of primitive data types, and so on. It is better to use Thrift or MessagePack. Failing that, write your own Serialize and Deserialize methods instead.