Making a dynamic array that accepts any type in C

Posted 2020-04-02 02:27

Question:

I'm trying to find a way to make a struct that holds a dynamic array and works with any data type (including user-defined types). So far this is what I have come up with:

#include <stddef.h>  /* size_t */

#define Vector(DATATYPE) struct {   DATATYPE* data; size_t size; size_t used; }

typedef Vector(int) int_Vector;

int main(int argc, char* argv[]){
    int_Vector vec;
    return 0;
}

While this works, I was wondering: is this good practice? Should I be doing something like this, or is there a better method? Also, is there a way to implement this without the typedef Vector(int) int_Vector part? Basically, I want a way to use the array the way C++ uses templates, so that it would look something like this:

#include <stddef.h>  /* size_t */

#define Vector(DATATYPE) struct {   DATATYPE* data; size_t size; size_t used; }

int main(int argc, char* argv[]){
    Vector(int) vec;
    return 0;
}

The goal is mainly to avoid so many typedefs and have it all under one name.

Answer 1:

Well no, C doesn't have a template system so you can't use one.

You can mimic the effect with macros as you did (a pretty clever solution), but that is a bit unconventional and requires users of your code to learn the macro and its limitations.

Normally C code doesn't try, since it's so awkward.

The most typical "generic" vector is something like GLib's GArray, which doesn't pretend to know the type of each element. Instead, that is left to the user to care about when accessing elements; the array just models each element as n bytes.

C11 also has _Generic selections, which might help a bit; I'm honestly not very experienced with them.



Answer 2:

The second example won't work because the two variables are defined as distinct types even though their members are the same. Why this is so is covered in my existing answer.

However the syntax can be kept the same using a slightly different approach:

#include <stdlib.h>

#define vector(type)    struct vector_##type

struct vector_int
{
    int* array;
    size_t count;
} ;

int main(void)
{
    vector(int) one = { 0 };
    vector(int) two = { 0 };

    one = two;
    ( void )one ;

    return 0;
}

The usage is surprisingly similar to C++'s vector<int>. A full example follows:

#include <stdlib.h>

#define vector_var(type)    struct vector_##type

struct vector_int
{
    int* array;
    size_t count;
};

void vector_int_Push( struct vector_int* object , int value ) 
{
    //implement it here
}

int vector_int_Pop( struct vector_int* object ) 
{
    //implement it here
    return 0;
}    

struct vector_int_table
{
    void( *Push )( struct vector_int* , int );
    int( *Pop )( struct vector_int* );

} vector_int_table = { 
                         .Push = vector_int_Push ,
                         .Pop = vector_int_Pop 
                     };

#define vector(type)   vector_##type##_table

int main(void)
{
    vector_var(int) one = { 0 };
    vector_var(int) two = { 0 };

    one = two;

    vector(int).Push( &one , 1 );
    int value = vector(int).Pop( &one );
    ( void )value;

    return 0;
}


Answer 3:

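The macro Vector(DATATYPE), defined as struct { DATATYPE* data; size_t size; size_t used; }, also fails for pointers to functions: with DATATYPE written as void (*)(void), the member DATATYPE* data does not expand to a valid declaration unless the type is first given a typedef name.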

void* is sufficient and well defined for a pointer to any object, but not so for a pointer to a function.

C does allow a pointer to a function of one type to be saved as a pointer to a function of another type. By using a union of the two, as below, code has enough space to save a pointer of either kind. Tracking which type is stored, and therefore which member is in use, remains up to the caller.

union u_ptr {
  void *object;
  void (*function)();
};


Answer 4:

Not bad, and I don't see any disadvantage. Just to explain another method that is commonly used in this case, here is one based on a union:

typedef union { int i; long l; float f; double d; /*(and so on)*/} vdata;
typedef enum  {INT_T,LONG_T,FLOAT_T, /*(and so on)*/} vtype;
typedef struct 
{
    vtype t;
    vdata data;
} vtoken;
typedef struct
{
    vtoken *tk;
    size_t sz;
    size_t n;
} Vector;

So this is one possible way. You can avoid the enum of data types by using typedefs, but if you mix types in arithmetic (e.g. summing a long, a double, a float and so on) you need it, because the code must know each operand's type to apply the right conversion. This is also why a union suits the job: each stored value keeps its native type, so all of C's arithmetic rules are left untouched.



Answer 5:

Expanding on this answer regarding a polymorphism solution, we can also make it include pointer types or user-defined types. The major advantage of this method is that it gets rid of the "data type" enum and, with it, all the run-time checking switch statements.

variant.h

#ifndef VARIANT_H
#define VARIANT_H

#include <stdio.h>
#include <stdint.h>

typedef void print_data_t (const void* data);
typedef void print_type_t (void);

typedef struct 
{
  void* data;
  print_data_t* print_data;
  print_type_t* print_type;
} variant_t;

void print_data_char    (const void* data);
void print_data_short   (const void* data);
void print_data_int     (const void* data);
void print_data_ptr     (const void* data);
void print_data_nothing (const void* data);

void print_type_char        (void);
void print_type_short       (void);
void print_type_int         (void);
void print_type_int_p       (void);
void print_type_void_p      (void);
void print_type_void_f_void (void);

void print_data (const variant_t* var);
void print_type (const variant_t* var);

#define variant_init(var) {                \
  .data = &(var),                          \
                                           \
  .print_data = _Generic((var),            \
    char:  print_data_char,                \
    short: print_data_short,               \
    int:   print_data_int,                 \
    int*:  print_data_ptr,                 \
    void*: print_data_ptr,                 \
    void(*)(void): print_data_nothing),    \
                                           \
  .print_type = _Generic((var),            \
    char:  print_type_char,                \
    short: print_type_short,               \
    int:   print_type_int,                 \
    int*:  print_type_int_p,               \
    void*: print_type_void_p,              \
    void(*)(void): print_type_void_f_void) \
}


#endif /* VARIANT_H */

variant.c

#include "variant.h"

void print_data_char    (const void* data) { printf("%c",  *(const char*)  data); }
void print_data_short   (const void* data) { printf("%hd", *(const short*) data); }
void print_data_int     (const void* data) { printf("%d",  *(const int*)   data); }
void print_data_ptr     (const void* data) { printf("%p",  data); }
void print_data_nothing (const void* data) {}

void print_type_char        (void) { printf("char");          }
void print_type_short       (void) { printf("short");         }
void print_type_int         (void) { printf("int");           }
void print_type_int_p       (void) { printf("int*");          }
void print_type_void_p      (void) { printf("void*");         }
void print_type_void_f_void (void) { printf("void(*)(void)"); }


void print_data (const variant_t* var)
{
  var->print_data(var->data);
}

void print_type (const variant_t* var)
{
  var->print_type();
}

main.c

#include <stdio.h>
#include "variant.h"

int main (void) 
{
  char c = 'A';
  short s = 3;
  int i = 5;
  int* iptr = &i;
  void* vptr= NULL;
  void (*fptr)(void) = NULL;

  variant_t var[] =
  {
    variant_init(c),
    variant_init(s),
    variant_init(i),
    variant_init(iptr),
    variant_init(vptr),
    variant_init(fptr)
  };

  for(size_t i=0; i<sizeof var / sizeof *var; i++)
  {
    printf("Type: ");
    print_type(&var[i]);
    printf("\tData: ");
    print_data(&var[i]);
    printf("\n");
  }

  return 0;
}

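Output (for the int* and void* rows, the value shown is the address of the iptr or vptr variable itself, because variant_init stores &var and print_data_ptr prints that address):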

Type: char      Data: A
Type: short     Data: 3
Type: int       Data: 5
Type: int*      Data: 000000000022FD98
Type: void*     Data: 000000000022FDA0
Type: void(*)(void)     Data:

A disadvantage of using _Generic for this purpose is that it prevents private encapsulation, since it has to be used in a macro in order to pass on type information.

On the other hand, the "variant" in this case has to be updated for every new type one comes up with, so it isn't all that practical or generic.

Still, these tricks are good to know for various similar purposes.