I'm trying to make a type-safe generic linked list in C using macros. It should work similarly to how templates work in C++. For example,
LIST(int) *list = LIST_CREATE(int);
My first attempt was for #define LIST(TYPE) (the macro I used above) to define a struct _List_##TYPE {...}. That, however, did not work because the struct would be redefined every time I declared a new list. I remedied the problem by doing this:
/* You would first have to use this macro, which will define
the `struct _List_##TYPE`... */
DEFINE_LIST(int);
int main(void)
{
/* ... And this macro would just be an alias for the struct, it
wouldn't actually define it. */
LIST(int) *list = LIST_CREATE(int);
return 0;
}
/* This is what the macros look like */
#define DEFINE_LIST(TYPE) \
struct _List_##TYPE \
{ \
... \
}
#define LIST(TYPE) \
struct _List_##TYPE
But another problem is that when I have multiple files that use DEFINE_LIST(int), for example, and some of them include each other, then there will still be multiple definitions of the same struct. Is there any way to make DEFINE_LIST check if the struct has already been defined?
/* one.h */
DEFINE_LIST(int);
/* two.h */
#include "one.h"
DEFINE_LIST(int); /* Error: already defined in one.h */
I tackled this problem in C before C++ acquired templates and I still
have the code.
You can't define a truly generic typesafe container-of-T template with macros
in a way that's confined entirely to header files. The standard preprocessor
provides no means of "pushing" and "popping" the macro assignments you will
require so as to preserve their integrity through nested and sequential
contexts of expansion. And you will encounter nested contexts as soon as you
try to eat your own dog food by defining a container-of-containers-of-T.
The thing can be done, as we'll see, but as @immortal suggests, it entails
generating distinct .h and .c files for each value of T that you require.
You can, for example, define a completely generic list-of-T with macros in
an inline file, say, list_type.inl, and then include list_type.inl in each of
a pair of small set-up wrappers - list_float.h and list_float.c - that
will respectively define and implement the list-of-float container. Similarly
for list-of-int, list-of-list-of-float, list-of-vector-of-list-of-double,
and so on.
A schematic example will make all clear. But first just get the full measure of
the eat-your-own-dogfood challenge.
Consider such a second-order container as a list-of-lists-of-thingummy. We want to
be able to instantiate these by setting T = list-of-thingummy for our macro
list-of-T solution. But in no way is list-of-thingummy going to be a POD
datatype. Whether list-of-thingummy is our own dogfood or somebody else's, it's
going to be an abstract datatype that lives on the heap and is represented to
its users through a typedef-ed pointer type. Or at the very least, it is going
to have dynamic components held on the heap. In any case, not POD.
This means it's not enough for our list-of-T solution just to be told that
T = list-of-thingummy. It must also be told whether a T requires non-POD
copy-construction and destruction, and if so how to copy-construct and destroy
one. In C terms, that means:
Copy-construction: How to create a copy of a given T in a T-sized
region of uncommitted memory, given the address of such a region.
Destruction: How to destroy the T at a given address.
We can do without knowing about default construction or construction from
non-T parameters, as we can reasonably restrict our list-of-T solution to
the containment of objects copied from user-supplied originals. But we do
have to copy them, and we have to dispose of our copies.
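To make those two operations concrete, here is a minimal sketch for a hypothetical non-POD type that owns a heap string. The names text_t, text_copy_init and text_dispose are invented for illustration. Only the shape of the two functions matters: copy into uncommitted memory and report failure, and release whatever the object owns.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical non-POD element type: it owns a heap-allocated string. */
typedef struct text {
    char *chars;
} text_t;

/* Copy-construct *psrc into the uncommitted memory at pdest.
   Returns pdest on success, or NULL on failure (pdest is left untouched). */
text_t *text_copy_init(text_t *pdest, text_t *psrc)
{
    size_t len = strlen(psrc->chars) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, psrc->chars, len);
    pdest->chars = copy;
    return pdest;
}

/* Destroy the text_t at pt by releasing the string it owns. */
void text_dispose(text_t *pt)
{
    free(pt->chars);
    pt->chars = NULL;
}
```

A plain assignment of one text_t to another would only copy the pointer, so both objects would share (and eventually double-free) the same string. That is why a container has to be told about functions like these.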
Next, suppose that we aspire to offer a template for set-of-T, or map-of-T1-to-T2,
in addition to list-of-T. These key-ordered datatypes add another parameter
we will have to plug in for any non-POD value of T or T1, namely how to order
any two objects of the key type. Indeed we will need that parameter for
any key datatype for which memcmp() won't do.
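As a small illustration of a key type for which memcmp() won't do, consider a hypothetical key that holds a pointer to a string stored elsewhere. The names person_key and person_key_compare are assumptions made up for this sketch. The ordering function follows the usual negative/zero/positive convention of strcmp().

```c
#include <string.h>

/* Hypothetical key type: the name lives elsewhere, the key only points at it. */
typedef struct person_key {
    const char *name;
} person_key;

/* Order two keys by the strings they point at, not by the pointer values.
   Returns <0, 0 or >0 in the manner of strcmp(). */
int person_key_compare(const person_key *a, const person_key *b)
{
    return strcmp(a->name, b->name);
}
```

Two keys that point at different copies of the same name compare equal here, whereas memcmp() on the structs would compare the pointers themselves and report a difference.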
Having noted that, we'll stick with the simpler list-of-T problem for the
schematic example; and for further simplicity I'll forget the desirability
of any const API.
For this and any other template container type we'll want some token-pasting
macros that let us conveniently assemble identifiers of functions and types,
plus probably other utility macros. These can all go in a header, say macro_kit.h,
such as:
#ifndef MACRO_KIT_H
#define MACRO_KIT_H
/* macro_kit.h */
// Helper for CAT2: pastes x and y exactly as given
#define _CAT2(x,y) x##y
// Concatenate 2 tokens x and y, expanding any macros in them first
#define CAT2(x,y) _CAT2(x,y)
// Concatenate 3 tokens x, y and z
#define CAT3(x,y,z) CAT2(x,CAT2(y,z))
// Join 2 tokens x and y with '_' = x_y
#define JOIN2(x,y) CAT3(x,_,y)
// Join 3 tokens x, y and z with '_' = x_y_z
#define JOIN3(x,y,z) JOIN2(x,JOIN2(y,z))
// Compute the memory footprint of n T's
#define SPAN(n,T) ((n) * sizeof(T))
#endif
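To see what these token-pasting macros produce, here is a small self-contained sketch. It repeats the macros so that it compiles on its own. The helper is named CAT2_IMPL here rather than _CAT2, because identifiers that begin with an underscore followed by a capital letter are reserved in C. The STR, ELEMENT and LIST_NAME macros and the functions are hypothetical additions used only to make the generated names visible.

```c
#include <string.h>

#define CAT2_IMPL(x,y) x##y
#define CAT2(x,y) CAT2_IMPL(x,y)
#define CAT3(x,y,z) CAT2(x,CAT2(y,z))
#define JOIN2(x,y) CAT3(x,_,y)
#define JOIN3(x,y,z) JOIN2(x,JOIN2(y,z))

/* Hypothetical helpers: turn a fully expanded identifier into a string. */
#define STR_IMPL(x) #x
#define STR(x) STR(x ## _DUMMY_NEVER_USED)
#undef STR
#define STR(x) STR_IMPL(x)

#define ELEMENT int                     /* plays the role of T */
#define LIST_NAME JOIN2(list,ELEMENT)   /* expands to list_int */

/* Defines a function named list_int_size (a stand-in that returns 0). */
int JOIN2(LIST_NAME,size)(void) { return 0; }

/* Report the generated identifiers as strings. */
const char *type_name(void)      { return STR(LIST_NAME); }
const char *push_back_name(void) { return STR(JOIN3(LIST_NAME,push,back)); }
```

The two-level CAT2/CAT2_IMPL arrangement is what lets ELEMENT be replaced by int before pasting. A single-level x##y would paste the unexpanded macro name instead.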
Now to the schematic structure of list_type.inl:
//! There is intentionally no idempotence guard on this file
#include "macro_kit.h"
#include <stddef.h>
#ifndef INCLUDE_LIST_TYPE_INL
#error This file should only be included from headers \
that define INCLUDE_LIST_TYPE_INL
#endif
#ifndef LIST_ELEMENT_TYPE
#error Need a definition for LIST_ELEMENT_TYPE
#endif
/* list_type.inl
Defines and implements a generic list-of-T container
for T the current values of the macros:
- LIST_ELEMENT_TYPE:
- must have a definition = the datatype (or typedef alias) for
which a list container is required.
- LIST_ELEMENT_COPY_INITOR:
- If undefined, then LIST_ELEMENT_TYPE is assumed to be copy-
initializable by the assignment operator. Otherwise must be defined
as the name of a copy initialization function having a prototype of
the form:
LIST_ELEMENT_TYPE * copy_initor_name(LIST_ELEMENT_TYPE *pdest,
LIST_ELEMENT_TYPE *psrc);
that will attempt to copy the LIST_ELEMENT_TYPE at `psrc` into the
uncommitted memory at `pdest`, returning `pdest` on success and NULL
on failure.
N.B. This file itself defines the copy initializer for the list-type
that it generates.
- LIST_ELEMENT_DISPOSE
If undefined, then LIST_ELEMENT_TYPE is assumed to need no
destruction. Otherwise the name of a destructor function having a
prototype of the form:
void dtor_name(LIST_ELEMENT_TYPE *pt);
that appropriately destroys the LIST_ELEMENT_TYPE at `pt`.
N.B. This file itself defines the destructor for the list-type that
it generates.
*/
/* Define the names of the list-type to generate,
e.g. list_int, list_float
*/
#define LIST_TYPE JOIN2(list,LIST_ELEMENT_TYPE)
/* Define the function-names of the LIST_TYPE API.
Each of the API macros LIST_XXXX generates a function name in
which LIST becomes the value of LIST_TYPE and XXXX becomes lowercase,
e.g. list_int_new
*/
#define LIST_NEW JOIN2(LIST_TYPE,new)
#define LIST_NODE JOIN2(LIST_TYPE,node)
#define LIST_DISPOSE JOIN2(LIST_TYPE,dispose)
#define LIST_COPY_INIT JOIN2(LIST_TYPE,copy_init)
#define LIST_COPY JOIN2(LIST_TYPE,copy)
#define LIST_BEGIN JOIN2(LIST_TYPE,begin)
#define LIST_END JOIN2(LIST_TYPE,end)
#define LIST_SIZE JOIN2(LIST_TYPE,size)
#define LIST_INSERT_BEFORE JOIN3(LIST_TYPE,insert,before)
#define LIST_DELETE_BEFORE JOIN3(LIST_TYPE,delete,before)
#define LIST_PUSH_BACK JOIN3(LIST_TYPE,push,back)
#define LIST_PUSH_FRONT JOIN3(LIST_TYPE,push,front)
#define LIST_POP_BACK JOIN3(LIST_TYPE,pop,back)
#define LIST_POP_FRONT JOIN3(LIST_TYPE,pop,front)
#define LIST_NODE_GET JOIN2(LIST_NODE,get)
#define LIST_NODE_NEXT JOIN2(LIST_NODE,next)
#define LIST_NODE_PREV JOIN2(LIST_NODE,prev)
/* Define the name of the structure used to implement a LIST_TYPE.
This structure is not exposed to user code.
*/
#define LIST_STRUCT JOIN2(LIST_TYPE,struct)
/* Define the name of the structure used to implement a node of a LIST_TYPE.
This structure is not exposed to user code.
*/
#define LIST_NODE_STRUCT JOIN2(LIST_NODE,struct)
/* The LIST_TYPE API... */
// Define the abstract list type
typedef struct LIST_STRUCT * LIST_TYPE;
// Define the abstract list node type
typedef struct LIST_NODE_STRUCT * LIST_NODE;
/* Return a pointer to the LIST_ELEMENT_TYPE in a LIST_NODE `node`,
or NULL if `node` is null
*/
extern LIST_ELEMENT_TYPE * LIST_NODE_GET(LIST_NODE node);
/* Return the LIST_NODE successor of a LIST_NODE `node`,
or NULL if `node` is null.
*/
extern LIST_NODE LIST_NODE_NEXT(LIST_NODE node);
/* Return the LIST_NODE predecessor of a LIST_NODE `node`,
or NULL if `node` is null.
*/
extern LIST_NODE LIST_NODE_PREV(LIST_NODE node);
/* Create a new LIST_TYPE optionally initialized with elements copied from
`start` and until `end`.
If `end` is null it is assumed == `start` + 1.
If `start` is not NULL then elements will be appended to the
LIST_TYPE until `end` or until an element cannot be successfully copied.
The size of the LIST_TYPE will be the number of successfully copied
elements.
*/
extern LIST_TYPE LIST_NEW(LIST_ELEMENT_TYPE *start, LIST_ELEMENT_TYPE *end);
/* Dispose of a LIST_TYPE
If the pointer to LIST_TYPE `plist` is not null and addresses
a non-null LIST_TYPE then the LIST_TYPE it addresses is
destroyed and set NULL.
*/
extern void LIST_DISPOSE(LIST_TYPE * plist);
/* Copy the LIST_TYPE at `psrc` into the LIST_TYPE-sized region at `pdest`,
returning `pdest` on success, else NULL.
If copying is unsuccessful the LIST_TYPE-sized region at `pdest` is
unchanged.
*/
extern LIST_TYPE * LIST_COPY_INIT(LIST_TYPE *pdest, LIST_TYPE *psrc);
/* Return a copy of the LIST_TYPE `src`, or NULL if `src` cannot be
successfully copied.
*/
extern LIST_TYPE LIST_COPY(LIST_TYPE src);
/* Return a LIST_NODE referring to the start of the
LIST_TYPE `list`, or NULL if `list` is null.
*/
extern LIST_NODE LIST_BEGIN(LIST_TYPE list);
/* Return a LIST_NODE referring to the end of the
LIST_TYPE `list`, or NULL if `list` is null.
*/
extern LIST_NODE LIST_END(LIST_TYPE list);
/* Return the number of LIST_ELEMENT_TYPEs in the LIST_TYPE `list`
or 0 if `list` is null.
*/
extern size_t LIST_SIZE(LIST_TYPE list);
/* Etc. etc. - extern prototypes for all API functions.
...
...
*/
/* If LIST_IMPLEMENT is defined then the implementation of LIST_TYPE is
compiled, otherwise skipped. #define LIST_IMPLEMENT to include this
file in the .c file that implements LIST_TYPE. Leave it undefined
to include this file in the .h file that defines the LIST_TYPE API.
*/
#ifdef LIST_IMPLEMENT
// Implementation code now included.
// Standard library #includes...?
// The heap structure of a list node
struct LIST_NODE_STRUCT {
struct LIST_NODE_STRUCT * _next;
struct LIST_NODE_STRUCT * _prev;
LIST_ELEMENT_TYPE _data[1];
};
// The heap structure of a LIST_TYPE
struct LIST_STRUCT {
size_t _size;
struct LIST_NODE_STRUCT * _anchor;
};
/* Etc. etc. - implementations for all API functions
...
...
*/
/* Undefine LIST_IMPLEMENT whenever it was defined.
Should never fall through.
*/
#undef LIST_IMPLEMENT
#endif // LIST_IMPLEMENT
/* Always undefine all the LIST_TYPE parameters.
Should never fall through.
*/
#undef LIST_ELEMENT_TYPE
#undef LIST_ELEMENT_COPY_INITOR
#undef LIST_ELEMENT_DISPOSE
/* Also undefine the "I really meant to include this" flag. */
#undef INCLUDE_LIST_TYPE_INL
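The implementation functions are elided above. As a rough sketch of how one of them might honour the optional LIST_ELEMENT_COPY_INITOR and LIST_ELEMENT_DISPOSE parameters, the following stand-alone fragment creates and frees a single node. The names demo_node, demo_node_new and demo_node_free are hypothetical and not part of list_type.inl. Here LIST_ELEMENT_TYPE is int and the two optional macros are left undefined, so the plain-assignment and no-destruction branches are the ones compiled.

```c
#include <stdlib.h>

#define LIST_ELEMENT_TYPE int
/* LIST_ELEMENT_COPY_INITOR and LIST_ELEMENT_DISPOSE are deliberately
   left undefined in this sketch. */

/* Hypothetical node with the same layout as the schematic node struct. */
struct demo_node {
    struct demo_node *_next;
    struct demo_node *_prev;
    LIST_ELEMENT_TYPE _data[1];
};

/* Allocate a node holding a copy of *psrc; returns NULL on failure. */
struct demo_node *demo_node_new(LIST_ELEMENT_TYPE *psrc)
{
    struct demo_node *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->_next = NULL;
    node->_prev = NULL;
#ifdef LIST_ELEMENT_COPY_INITOR
    /* Non-POD element: delegate to the user-supplied copy initializer. */
    if (LIST_ELEMENT_COPY_INITOR(node->_data, psrc) == NULL) {
        free(node);
        return NULL;
    }
#else
    /* POD element: plain assignment is enough. */
    node->_data[0] = *psrc;
#endif
    return node;
}

/* Destroy the element (if it needs it) and release the node. */
void demo_node_free(struct demo_node *node)
{
    if (node == NULL) {
        return;
    }
#ifdef LIST_ELEMENT_DISPOSE
    LIST_ELEMENT_DISPOSE(node->_data);
#endif
    free(node);
}
```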
Note that list_type.inl has no macro-guard against multiple inclusion. You want
at least some of it - at least the template API - to be included every time it is
seen.
If you read the comments at the top of the file you can guess how you would code
a wrapping header to import a list-of-int container type.
#ifndef LIST_INT_H
#define LIST_INT_H
/* list_int.h*/
#define LIST_ELEMENT_TYPE int
#define INCLUDE_LIST_TYPE_INL
#include "list_type.inl"
#endif
and likewise how you would code the wrapping header to import a list-of-list-of-int
container type:
#ifndef LIST_LIST_INT_H
#define LIST_LIST_INT_H
/* list_list_int.h*/
#define LIST_ELEMENT_TYPE list_int
#define LIST_ELEMENT_COPY_INITOR list_int_copy_init
#define LIST_ELEMENT_DISPOSE list_int_dispose
#define INCLUDE_LIST_TYPE_INL
#include "list_type.inl"
#endif
Your applications can safely include such wrappers, e.g.
#include "list_int.h"
#include "list_list_int.h"
despite the fact that they define LIST_ELEMENT_TYPE in conflicting ways, because
list_type.inl always #undefs all the macros that parameterize the list-type
when it's done with them: see the last few lines of the file.
Note too the use of the macro LIST_IMPLEMENT. If it's undefined when list_type.inl
is parsed then only the template API is exposed; the template implementation is
skipped. If LIST_IMPLEMENT is defined then the whole file is compiled. Thus our
wrapping headers, by not defining LIST_IMPLEMENT, import only the list-type
API.
Conversely for our wrapping source files list_int.c, list_list_int.c, we will
define LIST_IMPLEMENT. After that, there's nothing to do but include the
corresponding header:
/* list_int.c */
#define LIST_IMPLEMENT
#include "list_int.h"
and:
/* list_list_int.c*/
#include "list_int.h"
#define LIST_IMPLEMENT
#include "list_list_int.h"
Now in your application, no list-template macros appear. Your wrapping
headers parse out to "real code":
#include "list_int.h"
#include "list_list_int.h"
// etc.
int main(void)
{
int idata[10] = {1,2,3,4,5,6,7,8,9,10};
//...
list_int lint = list_int_new(idata,idata + 10);
//...
list_list_int llint = list_list_int_new(&lint,0);
//...
list_int_dispose(&lint);
//...
list_list_int_dispose(&llint);
//...
return 0;
}
To equip yourself with a "C template library" this way the only (!) hard work
is to write the .inl file for each container type you want and to test it
very, very thoroughly. You would then probably generate an object file
and header for each combination of native datatype and container type for
off-the-shelf linkage, and knock out the .h and .c wrappers in a jiffy for
other types on demand.
Needless to say, as soon as C++ sprouted templates my enthusiasm for sweating
them out this way evaporated. But it can be done this way, completely
generically, if for some reason C is the only option.
You could always add a second argument to the DEFINE_LIST macro that will allow you to "name" the list. For instance:
#define DEFINE_LIST(TYPE, NAME) \
struct _List_##TYPE##_##NAME \
{ \
TYPE member_1; \
struct _List_##TYPE##_##NAME* next; \
}
Then you could simply do:
DEFINE_LIST(int, my_list);
//... more code that uses the "my_list" type
You would just have to restrict yourself to not re-using the same list "name" when two different header files include each other, and both use the DEFINE_LIST macro. You would also have to refer to the list by name when using LIST_CREATE, etc.
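Here is a compact sketch of the named-list idea that compiles on its own. It is an illustration under a few assumptions: the two-argument LIST alias macro and the function sum_one are hypothetical, and the struct tag drops the leading underscore (List_ rather than _List_) because identifiers that start with an underscore and a capital letter are reserved in C. Note that the type and the name are pasted with TYPE##_##NAME, so that both parameters are actually substituted.

```c
#include <stddef.h>

/* Define a list struct for TYPE under a caller-chosen NAME. */
#define DEFINE_LIST(TYPE, NAME)            \
    struct List_##TYPE##_##NAME            \
    {                                      \
        TYPE member_1;                     \
        struct List_##TYPE##_##NAME *next; \
    }

/* Hypothetical alias macro: refers to a struct defined by DEFINE_LIST. */
#define LIST(TYPE, NAME) struct List_##TYPE##_##NAME

/* Two int lists with different names are two distinct, non-clashing types. */
DEFINE_LIST(int, one);
DEFINE_LIST(int, two);

/* Sum the elements of a list of the "one" flavour. */
int sum_one(LIST(int, one) *head)
{
    int total = 0;
    for (; head != NULL; head = head->next) {
        total += head->member_1;
    }
    return total;
}
```

Passing a LIST(int, two) pointer to sum_one would draw a compiler diagnostic for incompatible pointer types, which is the price of keeping the names separate.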
When passing the lists to functions that you've written, you can always create "generic" types that the user-defined "named" versions are cast to. This shouldn't affect anything since the actual information in the struct stays the same, and the "name" tag merely differentiates the types from a declaration rather than binary standpoint. For example, here is a function that takes list objects that store int types:
#define GENERIC_LIST_PTR(TYPE) struct _generic_list_type_##TYPE*
#define LIST_CAST_PTR(OBJ, TYPE) (GENERIC_LIST_PTR(TYPE))(OBJ)
void function(GENERIC_LIST_PTR(int) list)
{
//...use list as normal (i.e., access its int data-member, etc.)
}
DEFINE_LIST(int, my_list);
int main()
{
LIST(int, my_list)* list = LIST_CREATE(int, my_list);
function(LIST_CAST_PTR(list, int));
//...more code
return 0;
}
I know this isn't necessarily the most convenient thing, but this does resolve the naming issues, and you can control what versions of struct _generic_list_type_XXX are created in some private header file that other users won't be adding to (unless they wish to do so for their own types) ... but it would be a mechanism for separating the declaration and the definition of the generic list-type from the actual user-defined list-type.
It is possible to create generic and type-safe containers with macros. From the viewpoint of the theory of computation, the language (code) generated from macro expansions can be recognized by a nondeterministic pushdown automaton, which means that it is at most context-free. That statement makes our goal seem impossible to achieve, since the container and its affiliated iterators should remember the type they contain, and this can only be done by a context-sensitive grammar. However, we can do some tricks!
The key to success lies in the compilation process, in building symbol tables. If the type of a variable can be recognized when the compiler queries the table and no unsafe type casting occurs, then it is regarded as type-safe. Therefore, we have to give every struct a special name, because struct names may conflict if two or more structs are declared at the same level of scope. The easiest way is to append the current line number to the struct name. Standard C has supported the predefined macro __LINE__ and macro concatenation / stringification since ANSI C (C89/C90).
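As a minimal sketch of the line-number idea (not necessarily how any particular library does it), the following macro builds a struct whose tag embeds the line on which the macro is used. The names PASTE, PASTE_IMPL, LIST_OF and the list_line_ prefix are hypothetical. The two-level PASTE is needed so that __LINE__ is expanded to a number before it is concatenated.

```c
#include <stddef.h>

#define PASTE_IMPL(a, b) a##b
#define PASTE(a, b) PASTE_IMPL(a, b)

/* Hypothetical macro: a struct type whose tag contains the line number
   of the place where the macro is invoked. */
#define LIST_OF(T)                                    \
    struct PASTE(list_line_, __LINE__) {              \
        T value;                                      \
        struct PASTE(list_line_, __LINE__) *next;     \
    }

/* Each use sits on its own line, so each gets its own struct tag. */
LIST_OF(int) int_head = { 7, NULL };
LIST_OF(double) double_head = { 2.5, NULL };
```

Two uses on the same source line would produce the same tag and collide, so this trick relies on one declaration per line.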
Then, what we have to do is to hide some attributes in the struct we defined above. If you want to create another list record at run-time, putting a pointer to itself in the struct will actually solve the problem. However, this is not enough. We might need an extra variable to store how many list records we allocate at run-time. This helps us figure out how to free the memory when the list is destroyed explicitly by the programmer. Also, we can take advantage of the __typeof__() extension, which is widely used in macro programming.
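The role of __typeof__() can be sketched as follows. This is an illustration only, under the assumption of a compiler that supports the GNU __typeof__ extension (GCC and Clang do). The macro PUSH_FRONT, the record types and the demo function are hypothetical. The macro never names the record type: it recovers it from the head pointer, so the same macro works for any record that has value and next members, with no casts. The record counter mentioned above is left out for brevity.

```c
#include <stdlib.h>

/* Hypothetical macro: push v in front of the list whose head pointer is
   `head`. The node type is recovered from `head` with __typeof__.
   If allocation fails the list is left unchanged. */
#define PUSH_FRONT(head, v)                                  \
    do {                                                     \
        __typeof__(head) node_ = malloc(sizeof *node_);      \
        if (node_ != NULL) {                                 \
            node_->value = (v);                              \
            node_->next = (head);                            \
            (head) = node_;                                  \
        }                                                    \
    } while (0)

struct int_rec { int value; struct int_rec *next; };
struct dbl_rec { double value; struct dbl_rec *next; };

/* Build a short int list with the macro, then sum and free it. */
int demo_int_sum(void)
{
    struct int_rec *head = NULL;
    int sum = 0;
    PUSH_FRONT(head, 1);
    PUSH_FRONT(head, 2);
    PUSH_FRONT(head, 3);
    while (head != NULL) {
        struct int_rec *next = head->next;
        sum += head->value;
        free(head);
        head = next;
    }
    return sum;
}

/* The same macro works unchanged for a list of doubles. */
double demo_dbl_first(void)
{
    struct dbl_rec *head = NULL;
    double first = 0.0;
    PUSH_FRONT(head, 0.25);
    if (head != NULL) {
        first = head->value;
        free(head);
    }
    return first;
}
```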
I am the author of OpenGC3, which aims at building type-safe generic containers with macros, and here is a short example of how this library works:
ccxll(int) list; // declare a list of type int
ccxll_init(list); // initialize the list record
for (int cnt = 8; cnt-- > 0; ) //
ccxll_push_back(list, rand()); // insert "rand()" to the end
ccxll_sort(list); // sort with comparator: XLEQ
CCXLL_INCR_AUTO(pnum, list) // traverse the list forward:
printf("num = %d\n", *pnum); // access elems through iters
ccxll_free(list); // destroy the list after use
It is quite similar to the syntax of the STL. The type of list is determined when list is declared. We need not worry about type safety because no unsafe type casting occurs when operations are performed on the list.