Triple pointers in C: is it a matter of style?

Posted 2019-02-02 17:07

I feel like triple pointers in C are looked at as "bad". For me, it makes sense to use them at times.

Starting from the basics, the single pointer has two purposes: to create an array, and to allow a function to change its contents (pass by reference):

char *a;
a = malloc...

or

void foo (char *c) //means I'm going to modify the parameter in foo.
{ *c = 'f'; }

char a;
foo(&a);

The double pointer can be a 2D array (or array of arrays, since each "column" or "row" need not be the same length). I personally like to use it when I need to pass a 1D array:

void foo (char **c) //means I'm going to modify the elements of an array in foo.
{ (*c)[0] = 'f'; }

char *a;
a = malloc...
foo(&a);

To me, that helps describe what foo is doing. However, it is not necessary:

void foo (char *c) //am I modifying a char or just passing a char array?
{ c[0] = 'f'; }

char *a;
a = malloc...
foo(a);

will also work.

According to the first answer to this question, if foo were to modify the size of the array, a double pointer would be required.
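A minimal sketch of that resizing case, as I understand it: realloc may move the block, so the function has to write the new address back into the caller's pointer, which is why it takes char **. The name grow_buffer and its return convention are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: grows *buf to new_size bytes.
 * It takes char ** because realloc may return a different address,
 * and the caller's pointer must be updated to match.
 * Returns 0 on success, -1 on failure (*buf is then left untouched). */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;
    *buf = tmp;
    return 0;
}
```

A caller would write grow_buffer(&a, 64); with a plain char * parameter, the function could only change its own copy of the pointer.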

One can clearly see how a triple pointer (and beyond, really) would be required. In my case if I were passing an array of pointers (or array of arrays), I would use it. Evidently it would be required if you are passing into a function that is changing the size of the multi-dimensional array. Certainly an array of arrays of arrays is not too common, but the other cases are.

So what are some of the conventions out there? Is this really just a question of style/readability combined with the fact that many people have a hard time wrapping their heads around pointers?

5 answers
该账号已被封号
#2 · 2019-02-02 17:29

Using triple+ pointers harms both readability and maintainability.

Let's suppose you have a little function declaration here:

void fun(int***);

Hmmm. Is the argument a three-dimensional jagged array, or a pointer to a two-dimensional jagged array, or a pointer to a pointer to an array (as in, the function allocates an array and assigns a pointer to int inside the function)?

Let's compare this to:

void fun(IntMatrix*);

Surely you can use triple pointers to int to operate on matrices. But that's not what they are. The fact that they're implemented here as triple pointers is irrelevant to the user.

Complicated data structures should be encapsulated. This is one of the central ideas of object-oriented programming. Even in C, you can apply this principle to some extent: wrap the data structure in a struct, or use "handles", that is, pointers to an incomplete type, which is very common in C (this idiom is explained later in the answer).

Let's suppose that you implemented the matrices as jagged arrays of double. Compared to contiguous 2D arrays, they are worse to iterate over (the rows don't belong to a single block of contiguous memory), but they allow access with array notation and each row can have a different size.

The problem is that you can no longer change the representation: the use of pointers is hard-wired into user code, and you're stuck with the inferior implementation.

This wouldn't even be a problem if you had encapsulated it in a struct.

typedef struct Matrix_
{
    double** data;
} Matrix;

double get_element(Matrix* m, int i, int j)
{
    return m->data[i][j];
}

simply gets changed to

typedef struct Matrix_
{
    int width;
    double data[]; //C99 flexible array member
} Matrix;

double get_element(Matrix* m, int i, int j)
{
    return m->data[i*m->width+j];
}

The handle technique works like this: in the header file, you declare an incomplete struct and all the functions that work on a pointer to that struct:

// struct declaration with no body. 
struct Matrix_;
// optional: allow people to declare the matrix with Matrix* instead of struct Matrix*
typedef struct Matrix_ Matrix;

Matrix* create_matrix(int w, int h);
void destroy_matrix(Matrix* m);
double get_element(Matrix* m, int i, int j);
double set_element(Matrix* m, double value, int i, int j);

In the source file, you define the actual struct and all the functions:

// Matrix is already typedef'd in the header, so only the struct body is defined here.
struct Matrix_
{
    int width;
    double data[]; //C99 flexible array member
};

double get_element(Matrix* m, int i, int j)
{
    return m->data[i*m->width+j];
}

/* definition of the rest of the functions */

The rest of the world doesn't know what struct Matrix_ contains, and it doesn't know its size. This means users can't declare values of that type directly; they can only use a pointer to Matrix obtained from the create_matrix function. However, the fact that the user doesn't know the size means the user doesn't depend on it, which means we can remove or add members to struct Matrix_ at will.
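For completeness, here is a rough sketch of how create_matrix and destroy_matrix might look with the flexible array member layout. This is only one possible implementation under my own assumptions: that w is the number of columns, that elements start at 0.0, and that invalid dimensions yield NULL. It also skips overflow checks on w*h. The struct is repeated so the sketch compiles on its own.

```c
#include <stdlib.h>

/* Same layout as above, repeated so this sketch is self-contained. */
typedef struct Matrix_
{
    int width;
    double data[]; /* C99 flexible array member */
} Matrix;

/* One allocation holds the header and all w*h elements. */
Matrix* create_matrix(int w, int h)
{
    if (w <= 0 || h <= 0)
        return NULL;
    size_t count = (size_t)w * (size_t)h;
    Matrix* m = malloc(sizeof *m + count * sizeof(double));
    if (m == NULL)
        return NULL;
    m->width = w;
    for (size_t k = 0; k < count; k++)
        m->data[k] = 0.0; /* assumption: a new matrix starts zeroed */
    return m;
}

void destroy_matrix(Matrix* m)
{
    free(m); /* a single free, because there was a single allocation */
}

double get_element(Matrix* m, int i, int j)
{
    return m->data[i*m->width+j];
}
```

User code only ever calls these functions, so switching back to double** or to some other layout would not require touching it.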

【Aperson】
#3 · 2019-02-02 17:41

Unfortunately, you have misunderstood the concepts of pointers and arrays in C. Remember that arrays are not pointers.

Starting from the basics, the single pointer has two purposes: to create an array, and to allow a function to change its contents (pass by reference):

When you declare a pointer, you need to initialize it before using it in the program. This can be done either by assigning it the address of a variable or by dynamic memory allocation.
In the latter case, the pointer can be indexed like an array (but it is not an array).

The double pointer can be a 2D array (or array of arrays, since each "column" or "row" need not be the same length). I personally like to use it when I need to pass a 1D array:

Again, this is wrong. Arrays are not pointers and vice versa. A pointer to a pointer is not a 2D array.
I suggest you read section 6 of the c-faq, Arrays and Pointers.
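A small sketch that may make the difference concrete. The names grid, grid_bytes, grid_row_stride and first_of_row are invented for this illustration only.

```c
#include <stddef.h>

/* A true 2D array: one contiguous block of 3*4 chars, with no pointers stored. */
static char grid[3][4];

/* Size of the whole array object: 3 * 4 = 12 bytes. */
size_t grid_bytes(void)
{
    return sizeof grid;
}

/* Distance in bytes between row 0 and row 1: 4, since rows are adjacent. */
ptrdiff_t grid_row_stride(void)
{
    return (char *)&grid[1] - (char *)&grid[0];
}

/* When passed to a function, grid decays to char (*)[4], not to char **. */
char first_of_row(char (*rows)[4], size_t r)
{
    return rows[r][0];
}
```

Passing grid to a function that expects char ** does not type-check cleanly, because there is no array of row pointers anywhere in memory for that char ** to point at.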

狗以群分
#4 · 2019-02-02 17:42

Most of the time, the use of 3 levels of indirection is a symptom of bad design decisions made elsewhere in the program. Therefore it is regarded as bad practice, and there are jokes about "three star programmers" where, unlike the rating for restaurants, more stars mean worse quality.

The need for 3 levels of indirection often originates from the confusion about how to properly allocate multi-dimensional arrays dynamically. This is often taught incorrectly even in programming books, partially because doing it correctly was burdensome before the C99 standard. My Q&A post Correctly allocating multi-dimensional arrays addresses that very issue and also illustrates how multiple levels of indirection will make the code increasingly hard to read and maintain.
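As a rough illustration of the C99 approach (a sketch, not a copy of the linked post), a pointer to a variable-length array lets a single malloc call provide a contiguous 2D array that is still indexed as m[i][j]. The function name alloc_2d and the fill pattern are assumptions for this example; it also assumes the compiler supports VLAs and that rows and cols are non-zero.

```c
#include <stdlib.h>

/* Allocates rows x cols doubles as ONE contiguous block and fills
 * element [i][j] with i*cols + j.
 * Returned as void * because the VLA pointer type cannot be spelled in
 * the return type; the caller assigns the result to double (*)[cols]. */
void *alloc_2d(size_t rows, size_t cols)
{
    double (*m)[cols] = malloc(sizeof(double[rows][cols]));
    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            m[i][j] = (double)(i * cols + j);
    return m;
}
```

The caller writes double (*m)[cols] = alloc_2d(rows, cols); and later a single free(m);. No double ** or double *** appears anywhere.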

Though as that post explains, there are some situations where a type** might make sense. A variable table of strings with variable length is such an example. And when that need for type** arises, you might soon be tempted to use type***, because you need to return your type** through a function parameter.

Most often this need arises in a situation where you are designing some manner of complex ADT. For example, let's say that we are coding a hash table, where each index is a 'chained' linked list, and each node in the linked list is an array. The proper solution then is to re-design the program to use structs instead of multiple levels of indirection. The hash table, linked list and array should be distinct, autonomous types without any awareness of each other.

So by using proper design, we will avoid the multiple stars automatically.
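A possible shape for such a design, purely as a sketch: every type and function name below (IntArray, Node, List, HashTable, list_push, table_first_item) is invented for illustration, and hashing itself is left out.

```c
#include <stdlib.h>

/* Each layer is its own type and knows nothing about the layers above it. */
typedef struct
{
    int *items;
    size_t count;
} IntArray;

typedef struct Node
{
    IntArray payload;
    struct Node *next;
} Node;

typedef struct
{
    Node *head;
} List;

typedef struct
{
    List *buckets;
    size_t bucket_count;
} HashTable;

/* Prepends a node; only List internals are touched. Returns 0 or -1. */
int list_push(List *list, IntArray payload)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->payload = payload;
    n->next = list->head;
    list->head = n;
    return 0;
}

/* Reads one int from the first node of a bucket; each step is named. */
int table_first_item(const HashTable *t, size_t bucket, size_t index)
{
    return t->buckets[bucket].head->payload.items[index];
}
```

The data is still three levels deep, but no declaration carries more than one star, and each access says which layer it is passing through.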


But as with every rule of good programming practice, there are always exceptions. It is perfectly possible to have a situation like:

  • Must implement an array of strings.
  • The number of strings is variable and may change in run-time.
  • The length of the strings is variable.

You can implement the above as an ADT, but there may also be valid reasons to keep things simple and just use a char* [n]. You then have two options to allocate this dynamically:

char* (*arr_ptr)[n] = malloc( sizeof(char*[n]) );

or

char** ptr_ptr = malloc( sizeof(char*[n]) );

The former is more formally correct but also cumbersome, because it has to be used as (*arr_ptr)[i] = "string";, while the alternative can be used as ptr_ptr[i] = "string";.

Now suppose we have to place the malloc call inside a function and the return type is reserved for an error code, as is customary with C APIs. The two alternatives will then look like this:

err_t alloc_arr_ptr (size_t n, char* (**arr)[n])
{
  *arr = malloc( sizeof(char*[n]) );

  return *arr == NULL ? ERR_ALLOC : OK;
}

or

err_t alloc_ptr_ptr (size_t n, char*** arr)
{
  *arr = malloc( sizeof(char*[n]) );

  return *arr == NULL ? ERR_ALLOC : OK;
}

It is quite hard to argue and say that the former is more readable, and it also comes with the cumbersome access needed by the caller. The three star alternative is actually more elegant, in this very specific case.
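To show the caller's side of the three star version, here is a small sketch. The err_t enum is an assumed definition (the names are used above without being defined), and make_greeting_table is a hypothetical caller added only for this example.

```c
#include <stdlib.h>

/* Assumed definitions: err_t, OK and ERR_ALLOC are not defined in the text above. */
typedef enum { OK, ERR_ALLOC } err_t;

/* Same body as the three star version above. */
err_t alloc_ptr_ptr (size_t n, char*** arr)
{
  *arr = malloc( sizeof(char*[n]) );

  return *arr == NULL ? ERR_ALLOC : OK;
}

/* Hypothetical caller: &table has type char***, and afterwards
 * the table is indexed like an ordinary array of strings. */
err_t make_greeting_table (char*** out)
{
  char** table;
  err_t err = alloc_ptr_ptr(2, &table);
  if (err != OK)
    return err;

  table[0] = "hello";
  table[1] = "world";
  *out = table;
  return OK;
}
```

The third star only shows up in the parameter lists; every other line works with a plain char**.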

So it does us no good to dismiss 3 levels of indirection dogmatically. But the choice to use them must be well-informed, with an awareness that they may create ugly code and that there are other alternatives.

做个烂人
#5 · 2019-02-02 17:48

After two levels of indirection, comprehension becomes difficult. Moreover, if the reason you're passing these triple (or more) pointers into your methods is so that they can re-allocate and re-set some pointed-to memory, that gets away from the concept of methods as "functions" that just return values and don't affect state. This also negatively affects comprehension and maintainability beyond some point.

But more fundamentally, you've hit upon one of the main stylistic objections to the triple pointer right here:

One can clearly see how a triple pointer (and beyond, really) would be required.

It's the "and beyond" that is the issue here: once you get to three levels, where do you stop? Surely it's possible to have an arbitrary number of levels of indirection. But it's better to just have a customary limit someplace where comprehensibility is still good but flexibility is adequate. Two's a good number. "Three star programming", as it's sometimes called, is controversial at best; it's either brilliant, or a headache for those who need to maintain the code later.

何必那么认真
#6 · 2019-02-02 17:51

So what are some of the conventions out there? Is this really just a question of style/readability combined with the fact that many people have a hard time wrapping their heads around pointers?

Multiple indirection is not bad style, nor black magic, and if you're dealing with high-dimension data then you're going to be dealing with high levels of indirection; if you're really dealing with a pointer to a pointer to a pointer to T, then don't be afraid to write T ***p;. Don't hide pointers behind typedefs unless whoever is using the type doesn't have to worry about its "pointer-ness". For example, if you're providing the type as a "handle" that gets passed around in an API, such as:

typedef ... *Handle;

Handle h = NewHandle();
DoSomethingWith( h, some_data );
DoSomethingElseWith( h, more_data );
ReleaseHandle( h );

then sure, typedef away. But if h is ever meant to be dereferenced, such as

printf( "Handle value is %d\n", *h );

then don't typedef it. If your user has to know that h is a pointer to int (see note 1) in order to use it properly, then that information should not be hidden behind a typedef.

I will say that in my experience I haven't had to deal with higher levels of indirection; triple indirection has been the highest, and I haven't had to use it more than a couple of times. If you regularly find yourself dealing with >3-dimensional data, then you'll see high levels of indirection, but if you understand how pointer expressions and indirection work it shouldn't be an issue.


1. Or a pointer to pointer to int, or pointer to pointer to pointer to pointer to struct grdlphmp, or whatever.
