I've used structs extensively and I've seen some interesting things, especially *value
instead of value->first_value
where value is a pointer to struct, first_value
is the very first member, is *value
safe?
Also note that sizes aren't guaranteed because of alignment, whats the alginment value based on, the architecture/register size?
We align data/code for faster execution can we tell compiler not to do this? so maybe we can guarantee certain things about structs, like their size?
When doing pointer arithmetic on struct members in order to locate member offset, I take it you do -
if little endian +
for big endian, or does it just depend on the compiler?
what does malloc(0) really allocate?
The following code is for educational/discovery purposes, its not meant to be of production quality.
#include <stdlib.h>
#include <stdio.h>
int main()
{
printf("sizeof(struct {}) == %lu;\n", sizeof(struct {}));
printf("sizeof(struct {int a}) == %lu;\n", sizeof(struct {int a;}));
printf("sizeof(struct {int a; double b;}) == %lu;\n", sizeof(struct {int a; double b;}));
printf("sizeof(struct {char c; double a; double b;}) == %lu;\n", sizeof(struct {char c; double a; double b;}));
printf("malloc(0)) returns %p\n", malloc(0));
printf("malloc(sizeof(struct {})) returns %p\n", malloc(sizeof(struct {})));
struct {int a; double b;} *test = malloc(sizeof(struct {int a; double b;}));
test->a = 10;
test->b = 12.2;
printf("test->a == %i, *test == %i \n", test->a, *(int *)test);
printf("test->b == %f, offset of b is %i, *(test - offset_of_b) == %f\n",
test->b, (int)((void *)test - (void *)&test->b),
*(double *)((void *)test - ((void *)test - (void *)&test->b))); // find the offset of b, add it to the base,$
free(test);
return 0;
}
calling gcc test.c
followed by ./a.out
I get this:
sizeof(struct {}) == 0;
sizeof(struct {int a}) == 4;
sizeof(struct {int a; double b;}) == 16;
sizeof(struct {char c; double a; double b;}) == 24;
malloc(0)) returns 0x100100080
malloc(sizeof(struct {})) returns 0x100100090
test->a == 10, *test == 10
test->b == 12.200000, offset of b is -8, *(test - offset_of_b) == 12.200000
Update this is my machine:
gcc --version
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
uname -a
Darwin MacBookPro 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386
Calling
malloc(0)
will return a pointer that may be safely passed tofree()
at least once. If the same value is returned by multiplemalloc(0)
calls, it may be freed once for each such call. Obviously, if it returnsNULL
, that could be passed tofree()
an unlimited number of times without effect. Every call tomalloc(0)
which returns non-null should be balanced by a call tofree()
with the returned value.From 6.2.5/20:
To answer:
see 6.7.2.1/15:
There may however be padding bytes at the end of the structure as also in-between members.
In C,
malloc( 0 )
is implementation defined. (As a side note, this is one of those little things where C and C++ differ.)[1] Emphasis mine.
It is important to learn about the way that size of struct works. for example:
What this means is that struct sizes and offsets have to follow two rules:
1- The struct must be a multiple of it's first element (Why foo is 8 not 5 bytes)
2- A struct element must start on a multiple of itself. (In baz, int j could not start on 6, so bytes 6, 7, and 8 are wasted padding
Yes,
*value
is safe; it yields a copy of the structure thatvalue
points at. But it is almost guaranteed to have a different type from*value->first_value
, so the result of*value
will almost always be different from*value->first_value
.Counter-example:
Under this rather limited set of circumstances, you would get the same result from
*value
and*value->first_value
. Under that scheme, the types would be the same (even if the values are not). In the general case, the type of*value
and*value->first_value
are of different types.Since 'register size' is not a defined C concept, it isn't clear what you're asking. In the absence of pragmas (
#pragma pack
or similar), the elements of a structure will be aligned for optimal performance when the value is read (or written).The compiler is in charge of the size and layout of
struct
types. You can influence by careful design and perhaps by#pragma pack
or similar directives.These questions normally arise when people are concerned about serializing data (or, rather, trying to avoid having to serialize data by processing structure elements one at a time). Generally, I think you're better off writing a function to do the serialization, building it up from component pieces.
You're probably best off not doing pointer arithmetic on
struct
members. If you must, use theoffsetof()
macro from<stddef.h>
to handle the offsets correctly (and that means you're not doing pointer arithmetic directly). The first structure element is always at the lowest address, regardless of big-endianness or little-endianness. Indeed, endianness has no bearing on the layout of different members within a structure; it only has an affect on the byte order of values within a (basic data type) member of a structure.The C standard requires that the elements of a structure are laid out in the order that they are defined; the first element is at the lowest address, and the next at a higher address, and so on for each element. The compiler is not allowed to change the order. There can be no padding before the first element of the structure. There can be padding after any element of the structure as the compiler sees fit to ensure what it considers appropriate alignment. The size of a structure is such that you can allocate (N × size) bytes that are appropriately aligned (e.g. via
malloc()
) and treat the result as an array of the structure.If you have an inner structure it is guaranteed to start on the same address as the enclosing one if that is the first declaration of the enclosing structure.
So
*value
andvalue->first
is accessing memory at the same address (but using different types) in the followingAlso, the ordering between memebers of the structure is guaranteed to be the same as the declaration order
To adjust alignment, you can use compiler specific directives or use bitfields.
The alignment of structure memebers are usually based on what's best to access the individual members on the target platform
Also, for
malloc
, it is possible it keeps some bookkeeping near the returned address, so even for zero-size memory it can return a valid address (just don't try to access anything via the returned address)