How to pack a struct in Visual Studio to 24 bits t

I am trying to port an existing application from a 32-bit ARM microcontroller to desktop platforms such as Microsoft Windows. GCC is used on the ARM, and I was able to compile the application successfully on Windows using a 32-bit MinGW compiler. However, I had no success with Microsoft's Visual Studio compiler, which is why I am asking for help here.

Here is what my application is doing:

I have some framebuffer consisting of three bytes per pixel, so my memory looks like RGBRGBRGB and so on. I use a DMA-Channel on the ARM to push the pixels out to the display and my display directly understands this memory layout.

I also want to save some CPU cycles, so I want to use ARM's saturating add, __UQADD8, to draw on my framebuffer, performing a saturating add on all three channels in a single operation.

For this to work, I need all three channels to be available in a single integer that can be passed as an argument to __UQADD8.

That's why I use a union for each pixel of my framebuffer. It offers access to the separate channels through a struct containing R, G and B as uint8_t, and exposes the same memory as a 24-bit-wide integer labeled data:

union Rgb {
    struct {
        uint8_t r;
        uint8_t g;
        uint8_t b;
    } ch;
    unsigned int data : 24 __attribute__((__packed__));
};

The width of 24 bits and the packed attribute are added to the data member to restrict its width to three bytes. I can then use the data in a pixel like this:

Rgb Rgb::operator+(const Rgb & op) {
    // __UQADD8 returns the saturated per-byte sum; it does not modify its arguments.
    return Rgb(__UQADD8(data, op.data));
}

Note that __UQADD8 magically writes to only three of the four bytes of my integer and does not alter the R channel of the next RGB in my framebuffer.
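For readers without an ARM toolchain at hand, the per-byte saturating behaviour described above can be sketched in plain C++. This is only an illustration of the semantics, not the intrinsic itself; the function name uqadd8_emulated is made up for this example.

```cpp
#include <cstdint>

// Hypothetical stand-in for ARM's __UQADD8 (the name is an assumption for this
// sketch): adds the four byte lanes of a and b independently and clamps each
// lane to 255 instead of letting it wrap around or carry into its neighbour.
inline uint32_t uqadd8_emulated(uint32_t a, uint32_t b) {
    uint32_t result = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;   // extract one byte of a
        uint32_t y = (b >> (8 * lane)) & 0xFFu;   // extract the same byte of b
        uint32_t sum = x + y;
        if (sum > 0xFFu) {
            sum = 0xFFu;                          // saturate instead of wrapping
        }
        result |= sum << (8 * lane);              // put the lane back in place
    }
    return result;
}
```

For example, uqadd8_emulated(0x00F01020, 0x00203040) yields 0x00FF4060: the third lane would overflow (0xF0 + 0x20) and is clamped to 0xFF, while the other lanes add normally.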

The following test program shows that my RGBs are all tightly packed when using GCC:

#include <iostream>
#include <stdint.h>

union Rgb {
    struct {
        uint8_t r;
        uint8_t g;
        uint8_t b;
    } ch;
    unsigned int data : 24 __attribute__((packed));
};

int main()
{
    std::cout << "Sizeof(Rgb) = "  << sizeof(Rgb) << std::endl;
    return 0;
}

To compile the example with MSVC, one has to remove __attribute__((packed)). The program then runs, but prints 4. MSVC offers a number of other options, including #pragma pack, __unaligned pointers, __declspec(align) and so on, but I found no combination that packs my union to a size of three bytes.

How can I port my application to Microsoft's compiler while keeping the functionality and, preferably, compatibility with GCC?

Bonus question: How can I keep the functionality when compiling with a 64-bit compiler, either GCC or MSVC?

The answers to this SO question and this question mention that bit-field packing is "implementation-defined" and lead to this official MSVC documentation page, which says:

Adjacent bit fields are packed into the same 1-, 2-, or 4-byte allocation unit...

So it seems you cannot get an exact 3-byte bit field in MSVC. The only alternative I can think of is to do something like:

#pragma pack(push, 1)
union Rgb {
     struct {
          uint8_t r;
          uint8_t g;
          uint8_t b;
     } ch;
    unsigned char data[3];
};
#pragma pack(pop)

which will give you the desired 3-byte union size but may not be compatible with using __UQADD8(), at least directly.
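One way to bridge that gap, sketched here under assumptions rather than taken from the question, is to keep the pixel as three plain bytes and copy it into a zeroed 32-bit word only when a word-wide operation is wanted. The names Rgb24, load24, store24 and sat_add_u8 are hypothetical. The static_assert makes the build fail if a compiler ever pads the struct.

```cpp
#include <cstdint>
#include <cstring>

// Three plain bytes and no bit-field, so there is no allocation unit to widen.
struct Rgb24 {
    uint8_t r;
    uint8_t g;
    uint8_t b;
};
static_assert(sizeof(Rgb24) == 3, "Rgb24 must occupy exactly three bytes");

// Copy the three pixel bytes into a zeroed 32-bit word. Only three bytes are
// read, so the neighbouring pixel in the framebuffer is never touched.
inline uint32_t load24(const Rgb24& p) {
    uint32_t word = 0;
    std::memcpy(&word, &p, sizeof(Rgb24));
    return word;
}

// Copy the same three bytes of the word back into a pixel.
inline Rgb24 store24(uint32_t word) {
    Rgb24 p;
    std::memcpy(&p, &word, sizeof(Rgb24));
    return p;
}

// Saturating add for a single channel, used as the desktop fallback.
inline uint8_t sat_add_u8(uint8_t a, uint8_t b) {
    unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<uint8_t>(sum > 255u ? 255u : sum);
}

// Desktop fallback: clamp each channel separately. On the ARM target the body
// could instead be: return store24(__UQADD8(load24(a), load24(b)));
inline Rgb24 operator+(const Rgb24& a, const Rgb24& b) {
    Rgb24 out;
    out.r = sat_add_u8(a.r, b.r);
    out.g = sat_add_u8(a.g, b.g);
    out.b = sat_add_u8(a.b, b.b);
    return out;
}
```

Because load24 and store24 use the same three bytes of the word, the round trip does not depend on byte order, and since no pointer-sized or bit-field member is involved, the layout should be the same with 32-bit and 64-bit compilers.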