Unions versus structures in C

Posted 2019-03-10 06:45

Question:

The idea behind this question is to understand the deeper concepts of unions and how to use them in different ways to save memory. My question is this:

let's say there is a structure

struct strt
{
   float f;
   char c;
   int a;
};

and the same structure represented in union

union unin
{
   float f;
   char c;
   int a;
};

If I assign values to the structure members one after another and then print them, they all print correctly. With the union this does not happen; some overwriting takes place.

So I need a method that can store the values of f, c and a using a union and then print them all (applying any operations necessary). Can anybody guide me or give me an idea?

Answer 1:

If you were to look at how a struct stores its values, it would be something like this:

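|0---1---2---3---|4---|5---6---7---|8---9---10--11--|
|ffffffffffffffff|    |            |                | <- f: Where your float is stored
|                |cccc|            |                | <- c: Where your char is stored
|                |    |pppppppppppp|                | <- padding inserted by the compiler
|                |    |            |aaaaaaaaaaaaaaaa| <- a: Where your int is stored

So when you change the value of f, you are actually changing bytes 0-3. When you change your char, you are actually changing byte 4. Bytes 5-7 are padding that the compiler typically inserts so that the int is 4-byte aligned (this assumes a 4-byte float and int, as on most current platforms). When you change your int, you are actually changing bytes 8-11.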

If you now look at how a union stores its values, it would be something like this:

|0---1---2---3---|
|ffffffffffffffff| <- f: where your float is stored
|cccc------------| <- c: where your char is stored
|aaaaaaaaaaaaaaaa| <- a: where your int is stored

So now, when you change the value of f, you are changing bytes 0-3. Since c is stored in byte 0, when you change f, you also change c and a! When you change c, you're changing part of f and a, and when you change a, you're changing c and f. That's where your "overwriting" is happening. When you pack the 3 values into the one memory address, you're not "saving space" at all; you're just creating 3 different ways of looking at and changing the same data. You don't really have an int, a float, and a char in that union. At the physical level, you've just got 32 bits (assuming a 4-byte int and float), which could be viewed as an int, a float, or a char. Changing one is meant to change the others. If you don't want them to change each other, then use a struct.

This is why gcc tells you that your struct is larger than your union (typically 12 bytes, including alignment padding, versus only 4). The union is not saving space; structs and unions are simply not the same thing.
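To check the layout on your own compiler, a minimal sketch along these lines may help. The function name print_layout is made up for illustration, and the exact sizes and offsets it reports depend on the platform's type sizes and alignment rules.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <stdio.h>

struct strt { float f; char c; int a; };
union  unin { float f; char c; int a; };

/* Hypothetical helper: prints the size of each type and the
   byte offset of each member as chosen by this compiler. */
void print_layout(void)
{
    /* Every union member starts at offset 0; the language guarantees it. */
    assert(offsetof(union unin, f) == 0);
    assert(offsetof(union unin, c) == 0);
    assert(offsetof(union unin, a) == 0);

    printf("struct: size %zu, f at %zu, c at %zu, a at %zu\n",
           sizeof(struct strt), offsetof(struct strt, f),
           offsetof(struct strt, c), offsetof(struct strt, a));
    printf("union:  size %zu, all members at offset 0\n",
           sizeof(union unin));
}
```

Calling print_layout() from main shows where the compiler actually put each member. On a platform with a 4-byte float and int you would typically expect the struct to be larger than the sum of its members because of padding.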

Answer 2:

I think you misunderstand the purpose of a union.

A union, as the name suggests, defines a type whose members all occupy the same memory. A struct, by contrast, places each of its members in separate memory within a single, contiguous area.

With your union, when you write:

union unin foo;
foo.c = 3;

Then foo.a and foo.f will both be changed. This is because .a, .c, and .f are stored at the same memory location. Thus, each member of a union is a different "view" of the same memory. This does not happen with a struct because all of the members are distinct and separate from each other.

There is no way around this behavior because it's intentional.
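The shared storage can be observed directly. The following is only a sketch: the two function names are invented for illustration, and the second function inspects the first byte of a with memcpy so that it does not rely on how the remaining bytes of a are laid out.

```c
#include <assert.h>
#include <string.h>  /* memcpy */

union unin { float f; char c; int a; };

/* Hypothetical helper: returns 1 if f, c and a all start at the same address. */
int members_share_address(void)
{
    union unin foo;
    const void *pf = &foo.f;
    const void *pc = &foo.c;
    const void *pa = &foo.a;
    return pf == pc && pc == pa;
}

/* Hypothetical helper: stores a, then c, and returns the first
   byte of a as it looks after c has been written. */
unsigned char first_byte_of_a_after_writing_c(void)
{
    union unin foo;
    unsigned char first_byte;

    assert(members_share_address());
    foo.a = 0;
    foo.c = 3;                        /* overwrites byte 0 of a (and of f) */
    memcpy(&first_byte, &foo.a, 1);   /* inspect byte 0 of a */
    return first_byte;
}
```

The second function returns 3 even though a was set to 0, because the write to c landed in the byte that a also uses.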

Answer 3:

I think you are misunderstanding Unions.

The idea behind using unions is to save memory...

yes, that's one reason

... and get result equivalent to structure...

It's not equivalent. They look similar in source code, but they are completely different things. Like apples and airplanes.

Unions are a very, very low level construct that allows you to see a piece of memory as if it were storing any of its "members", but you can only use one at a time. Even the use of the word "member" is extremely misleading. They should be called "views" or something, not members.

When you write:

union ABCunion
{
    int a;
    double b;
    char c;
} myAbc;

You are saying: "take a piece of memory big enough for the biggest among an int, a double and a char, and let's call it myAbc."

In that memory, now you can store either an int, or a double, or a char. If you store an int, and then store a double, the int is gone forever.

What's the point then?

There are two major uses for Unions.

a) Discriminated storage

That's what we did above. I pick a piece of memory and I give it different meanings depending on context. Sometimes the context is explicit (you keep some variable that indicates what "kind" of variable you stored), and sometimes it can be implicit (based on the section of code, you can tell which one must be in use). Either way, the code needs to be able to figure it out, or you won't be able to do anything sensible with the variable.

A typical (explicit) example would be:

struct MyVariantType
{
    int typeIndicator ;  // type=1 -> It's an int, 
                         // type=2 -> It's a  double, 
                         // type=3 -> It's a  char
    union ABCunion body;
};

For example, VB6's "Variants" are Unions not unlike the above (but more complex).

b) Split representation

This is sometimes useful when you need to be able to see a variable as either a "whole" or as a combination of parts. It's easier to explain with an example:

union DOUBLEBYTE
{
    struct
    {
        unsigned char a;
        unsigned char b;
    } bytes;
    short Integer;        
} myVar;

Here's a short int "unioned" with a pair of bytes. Now, you can view the same value as either a short int (myVar.Integer), or you can just as easily study the individual bytes that make up the value (myVar.bytes.a and myVar.bytes.b).

Note that this second use is not portable (I'm pretty sure), meaning that it's not guaranteed to work the same way across different machine architectures: which of the two bytes holds the low half of the value depends on the machine's byte order. Still, this use is absolutely essential for the kind of tasks for which C was designed (OS implementation).
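The portability caveat can be made concrete with a small sketch. It swaps in the fixed-width types uint16_t and uint8_t for short and unsigned char so the sizes are known, and the names DoubleByte, low_byte_is_first and high_byte are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

union DoubleByte {
    struct { uint8_t a; uint8_t b; } bytes;
    uint16_t integer;
};

/* Returns 1 if the low byte of the value is stored first
   (little-endian), 0 if the high byte is stored first. */
int low_byte_is_first(void)
{
    union DoubleByte v;
    v.integer = 0x1234;
    /* Either byte order is legal; the machine decides which one you get. */
    assert((v.bytes.a == 0x34 && v.bytes.b == 0x12) ||
           (v.bytes.a == 0x12 && v.bytes.b == 0x34));
    return v.bytes.a == 0x34;
}

/* Portable alternative: a shift gives the same answer on every machine. */
uint8_t high_byte(uint16_t x)
{
    return (uint8_t)(x >> 8);
}
```

If the code only needs "the upper byte of this number", the shift version avoids the byte-order question entirely. The union version is the one to reach for when the in-memory layout itself is what matters.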

Answer 4:

A union contains a set of mutually exclusive data.

In your particular example, you can store the float (f), char (c) or int (a) in the union. However, memory will only be allocated for the largest item in the union. All items in the union will share the same portion of memory. In other words, writing one value into the union followed by another will cause the first value to be overwritten.

You need to go back and ask yourself what you are modelling:

Do you truly want the values of f, c and a to be mutually exclusive (i.e. only one value can exist at once)? If so, consider using a union in conjunction with an enum value (stored outside the union) indicating which member in the union is the "active" one at any particular point in time. This will allow you to get the memory benefits of using a union, at the cost of more dangerous code (as anyone maintaining it will need to be aware that the values are mutually exclusive - i.e. it is indeed a union). Only consider this option if you are creating many of these unions and memory conservation is vital (e.g. on embedded CPUs). You may NOT even end up saving memory because you will need to create enum variables on the stack which will take up memory too.
Do you want these values to be simultaneously active and not interfere with each other? If so, you will need to use a struct instead (as you put in your first example). This will use more memory - when you instantiate a struct, the memory that is allocated is the sum of all members (plus some padding to the nearest word boundary). Unless memory conservation is of paramount importance (see previous example), I would favour this approach.

Edit:

(Very simple) example of how to use enums in conjunction with a union:

typedef union
{
    float f;
    char c;
    int a;
} floatCharIntUnion;

typedef enum
{
    usingFloat,
    usingChar,
    usingInt
} unionSelection;

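// Declared up front because main calls it before its definition below.
void processUnion(floatCharIntUnion* myUnion, unionSelection selection);

int main(void)
{
    floatCharIntUnion myUnion;
    unionSelection selection;

    myUnion.f = 3.1415f;
    selection = usingFloat;
    processUnion(&myUnion, selection);

    myUnion.c = 'a';
    selection = usingChar;
    processUnion(&myUnion, selection);

    myUnion.a = 22;
    selection = usingInt;
    processUnion(&myUnion, selection);

    return 0;
}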

void processUnion(floatCharIntUnion* myUnion, unionSelection selection)
{

    switch (selection)
    {
    case usingFloat:
        // Process myUnion->f
        break;
    case usingChar:
        // Process myUnion->c
        break;
    case usingInt:
        // Process myUnion->a
        break;
    }
}
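As one possible way to fill in the "Process" placeholders, here is a sketch that formats whichever member is active into a caller-supplied buffer. The function describeUnion and its output format are assumptions made for illustration; they are not part of the answer's code.

```c
#include <assert.h>
#include <stdio.h>

typedef union { float f; char c; int a; } floatCharIntUnion;
typedef enum { usingFloat, usingChar, usingInt } unionSelection;

/* Hypothetical helper: writes a description of the active member into buf
   and returns the result of snprintf (the length of the formatted text). */
int describeUnion(const floatCharIntUnion *u, unionSelection selection,
                  char *buf, size_t len)
{
    assert(u != NULL && buf != NULL);
    switch (selection) {
    case usingFloat:
        return snprintf(buf, len, "float %.2f", u->f);
    case usingChar:
        return snprintf(buf, len, "char %c", u->c);
    case usingInt:
        return snprintf(buf, len, "int %d", u->a);
    }
    return -1;  /* selection did not match any known member */
}
```

The switch reads only the member that the selector names, which is the whole point of keeping the enum next to the union.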

Answer 5:

This is a classic example of using a union to store data depending on an external marker.

The int, float and char * all occupy the same place in the union; they are not consecutive. So if you need to store them all, it's a structure you're looking for, not a union.

The structure is at least the size of the largest thing in the union plus the size of the type field, since the type field sits outside the union (the compiler may add padding on top of that).

#include <stdio.h>

#define TYP_INT 0
#define TYP_FLT 1
#define TYP_STR 2

typedef struct {
    int type;          // one of the TYP_* values; says which member is valid
    union {
        int a;
        float b;
        char *c;
    } data;
} tMyType;

static void printMyType (tMyType * x) {
    if (x->type == TYP_INT) {
        printf ("%d\n", x->data.a);
        return;
    }
    if (x->type == TYP_FLT) {
        printf ("%f\n", x->data.b);
        return;
    }
    if (x->type == TYP_STR) {
        printf ("%s\n", x->data.c);
        return;
    }
}

The printMyType function will correctly detect what's stored in the structure (unless you lie to it) and print out the relevant value.

When you populate one of them, you have to do:

x.type = TYP_INT;
x.data.a = 7;

x.type = TYP_STR;
x.data.c = "Hello";

and a given x can only be one thing at a time.

Woe betide anyone who tries:

x.type = TYP_STR;
x.data.a = 7;

They're asking for trouble.
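One way to make that mistake harder is to set the tag and the value together through small helpers. This is just a sketch of the idea; the helper names setInt, setStr and getInt are invented here, and the string member is declared const char * so that it can point at string literals safely.

```c
#include <assert.h>

#define TYP_INT 0
#define TYP_FLT 1
#define TYP_STR 2

typedef struct {
    int type;
    union {
        int a;
        float b;
        const char *c;   /* const here is a choice made for this sketch */
    } data;
} tMyType;

/* Hypothetical setters: the tag and the value always change together. */
void setInt(tMyType *x, int value)
{
    x->type = TYP_INT;
    x->data.a = value;
}

void setStr(tMyType *x, const char *value)
{
    x->type = TYP_STR;
    x->data.c = value;
}

/* Hypothetical getter: refuses to read the int unless the tag says it is one. */
int getInt(const tMyType *x)
{
    assert(x->type == TYP_INT);
    return x->data.a;
}
```

With helpers like these, code outside them never writes x.type or x.data directly, so the two cannot drift apart by accident.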

Answer 6:

Unions are usually used when only one of the members below will be stored in an instance at any given time, i.e. you can store either a float, a char or an int at any instant. This saves memory by not allocating separate storage for a float and an int when you are only going to store a char. The amount of memory allocated is that of the largest type in the union (plus any padding).

union unin
{
   float f;
   char c;
   int a;
};

The other use of a union is when you want to store something that has parts. Let's say you want to model a register as a union containing the upper byte, the lower byte and a composite value. You can then store a composite value into the union and read the pieces back through the other members.