In the Linux kernel code I found the following, which I cannot understand.
struct bts_action {
    u16 type;
    u16 size;
    u8 data[0];
} __attribute__ ((packed));
The code is here: http://lxr.free-electrons.com/source/include/linux/ti_wilink_st.h
What's the need and purpose of an array of data with zero elements?
The idea is to allow for a variable-sized array at the end of the struct. Presumably, bts_action is some data packet with a fixed-size header (the type and size fields) and a variable-size data member. By declaring it as a 0-length array, it can be indexed just as any other array. You'd then allocate a bts_action struct with, say, a 1024-byte data area in a single call, by requesting sizeof(struct bts_action) + 1024 bytes.
See also: http://c2.com/cgi/wiki?StructHack
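The allocation described in this answer can be sketched in user-space C. This is only an illustration under stated assumptions: malloc stands in for the kernel's kmalloc, the u16/u8 typedefs are stand-ins for the kernel types, and bts_action_new is a hypothetical helper, not a function from the kernel source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-ins for the kernel's u16/u8 types (assumption). */
typedef uint16_t u16;
typedef uint8_t  u8;

struct bts_action {
    u16 type;
    u16 size;
    u8  data[0];   /* GCC zero-length array: adds nothing to sizeof */
} __attribute__ ((packed));

/*
 * Hypothetical helper: allocate the header and `len` payload bytes as one
 * block, then copy the payload in. Returns NULL if the allocation fails.
 */
struct bts_action *bts_action_new(u16 type, const void *payload, u16 len)
{
    struct bts_action *a = malloc(sizeof(*a) + len);

    if (a == NULL)
        return NULL;
    a->type = type;
    a->size = len;
    if (len > 0)
        memcpy(a->data, payload, len);   /* data[] starts right after the header */
    return a;
}
```

A single free() releases the header and the data together, which is the practical benefit of keeping both in one block.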
Another, less common use of a zero-length array is to get a named label inside a struct.
Suppose you have some large struct definitions (each spanning multiple cache lines) and you want to make sure they are aligned to a cache line boundary both at the beginning and in the middle, where they cross the boundary.
In code you can request the alignment with a GCC extension, the aligned attribute: __attribute__((aligned(N))), where N is the cache line size in bytes.
But you still want to make sure this is enforced at runtime.
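A minimal sketch of such a declaration and its runtime check follows. The struct example_large, its members, the 64-byte CACHE_LINE_BYTES value, and the helper example_large_is_aligned are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64   /* assumed cache line size */

struct example_large {
    uint32_t  first  __attribute__((aligned(CACHE_LINE_BYTES))); /* starts line 0 */
    uint32_t  data[8];
    uint64_t *second __attribute__((aligned(CACHE_LINE_BYTES))); /* starts line 1 */
    uint32_t  more[8];
};

/*
 * Returns 1 when both the member offsets and the address of this
 * particular object honour the cache line alignment, 0 otherwise.
 */
int example_large_is_aligned(const struct example_large *p)
{
    return offsetof(struct example_large, first) == 0
        && offsetof(struct example_large, second) == CACHE_LINE_BYTES
        && ((uintptr_t)p % CACHE_LINE_BYTES) == 0;
}
```

The check has to name first and second explicitly, which is what makes it awkward to reuse for other structs.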
This would work for a single struct, but it would be hard to cover many structs, each with differently named members to be aligned. You would most likely end up with code like this, where you have to find the name of the first member of each struct:
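To illustrate the maintenance problem, here is a sketch with two hypothetical structs (rx_state and tx_state, invented for this example). Each assertion has to spell out a different member name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64   /* assumed cache line size */

struct rx_state {
    uint32_t head  __attribute__((aligned(CACHE_LINE_BYTES)));
    uint32_t ring[8];
    uint64_t drops __attribute__((aligned(CACHE_LINE_BYTES)));
};

struct tx_state {
    uint64_t queued      __attribute__((aligned(CACHE_LINE_BYTES)));
    uint8_t  flags[16];
    void    *completion  __attribute__((aligned(CACHE_LINE_BYTES)));
};

/* Every struct contributes its own pair of member names to this list. */
void check_layouts(void)
{
    assert(offsetof(struct rx_state, head) == 0);
    assert(offsetof(struct rx_state, drops) == CACHE_LINE_BYTES);
    assert(offsetof(struct tx_state, queued) == 0);
    assert(offsetof(struct tx_state, completion) == CACHE_LINE_BYTES);
}
```

Renaming or reordering a member silently invalidates the matching assertion, so the list must be updated by hand for every struct.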
Instead of going this way, you can declare a zero-length array in the struct that acts as a named label: it has a consistent name but does not consume any space.
Then the runtime assertion code would be much easier to maintain:
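A sketch of the label approach, assuming GCC's zero-length array extension and a 64-byte cache line. The macro names CACHE_LINE_ALIGN_MARK and ASSERT_CACHE_LINE_MARKS, the label names cacheline0/cacheline1, and both structs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64   /* assumed cache line size */

/* Zero-length, cache-line-aligned array: a label that occupies no bytes. */
#define CACHE_LINE_ALIGN_MARK(mark) \
    uint8_t mark[0] __attribute__((aligned(CACHE_LINE_BYTES)))

struct rx_state {
    CACHE_LINE_ALIGN_MARK(cacheline0);
    uint32_t head;
    uint32_t ring[8];
    CACHE_LINE_ALIGN_MARK(cacheline1);
    uint64_t drops;
};

struct tx_state {
    CACHE_LINE_ALIGN_MARK(cacheline0);
    uint64_t queued;
    uint8_t  flags[16];
    CACHE_LINE_ALIGN_MARK(cacheline1);
    void    *completion;
};

/* The same two label names work for every struct that uses the marks. */
#define ASSERT_CACHE_LINE_MARKS(type)                            \
    do {                                                         \
        assert(offsetof(type, cacheline0) == 0);                 \
        assert(offsetof(type, cacheline1) == CACHE_LINE_BYTES);  \
    } while (0)

void check_layouts(void)
{
    ASSERT_CACHE_LINE_MARKS(struct rx_state);
    ASSERT_CACHE_LINE_MARKS(struct tx_state);
}
```

Because the labels have zero size, the member that follows each label starts at the label's own offset, so the real members can be renamed freely without touching the assertions.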
This is a hack actually, for GCC (C90) in fact.
It's also called a struct hack.
So the next time, I would say:
It will be equivalent to saying:
And I can create any number of such struct objects.
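The equivalence this answer describes can be sketched as follows, assuming 100 extra bytes of data. The fixed-size struct bts_action_100 and the helper bts_action_alloc_100 are hypothetical, introduced only to compare layouts; malloc stands in for kmalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel's u16/u8 types (assumption). */
typedef uint16_t u16;
typedef uint8_t  u8;

struct bts_action {
    u16 type;
    u16 size;
    u8  data[0];
} __attribute__ ((packed));

/* What the allocated block looks like once 100 data bytes are added. */
struct bts_action_100 {
    u16 type;
    u16 size;
    u8  data[100];
} __attribute__ ((packed));

/* One allocation: header plus 100 bytes that data[] can index. */
struct bts_action *bts_action_alloc_100(void)
{
    return malloc(sizeof(struct bts_action) + 100 * sizeof(u8));
}
```

The difference is that the size 100 is chosen at run time per object, so each allocation can carry a different amount of data behind the same header type.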
The code is not valid C (see this). The Linux kernel is, for obvious reasons, not in the slightest concerned with portability, so it uses plenty of non-standard code.
What they are doing is a non-standard GCC extension: an array of size 0. A standard-compliant program would have written
u8 data[];
and it would have meant the very same thing. The authors of the Linux kernel apparently love to make things needlessly complicated and non-standard, if an option to do so reveals itself.
In older C standards, ending a struct with an empty array was known as "the struct hack". Others have already explained its purpose in other answers. The struct hack, in the C90 standard, was undefined behavior and could cause crashes, mainly since a C compiler is free to add any number of padding bytes at the end of the struct. Such padding bytes may collide with the data you tried to "hack" in at the end of the struct.
GCC early on made a non-standard extension to change this from undefined to well-defined behavior. The C99 standard then adopted this concept, and any modern C program can therefore use this feature without risk. It is known as a flexible array member in C99/C11.
This is a way to have variable sizes of data, without having to call malloc (kmalloc in this case) twice: the struct and its data are requested together, as one allocation of sizeof(struct bts_action) plus the number of data bytes.
This used to be non-standard and was considered a hack (as Aniket said), but it was standardized in C99. The standard format for it now is:
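As a sketch, the struct from the question rewritten with a C99 flexible array member might look like this. The u16/u8 typedefs are user-space stand-ins for the kernel types, and the GCC-specific packed attribute is left out here so that the declaration is plain standard C.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's u16/u8 types (assumption). */
typedef uint16_t u16;
typedef uint8_t  u8;

struct bts_action {
    u16 type;
    u16 size;
    u8  data[];   /* C99 flexible array member: no size, must be last */
};
```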
Note that you don't mention any size for the data field. Note also that this special variable can only come at the end of the struct.
In C99, this matter is explained in 6.7.2.1.16 (emphasis mine):
Or in other words, if you have:
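A sketch of the kind of declaration and allocation meant here. The len field and the helper something_alloc are hypothetical; only struct something, var, data, and extra are names used in this answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct something {
    size_t        len;     /* hypothetical fixed-size header field */
    unsigned char data[];  /* flexible array member, must be last */
};

/*
 * Hypothetical helper: one allocation holding the header followed by
 * `extra` bytes that var->data can index. Returns NULL on failure.
 */
struct something *something_alloc(size_t extra)
{
    struct something *var = malloc(sizeof(*var) + extra);

    if (var != NULL)
        var->len = extra;
    return var;
}
```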
You can access var->data with indices in [0, extra). Note that sizeof(struct something) will only give the size accounting for the other variables, i.e. gives data a size of 0.
It may be interesting also to note how the standard actually gives examples of mallocing such a construct (6.7.2.1.17):
Another interesting note by the standard in the same location is (emphasis mine):