C/C++ int[] vs int* (pointers vs. array notation).

Posted 2020-01-23 17:06

Question:

I know that arrays in C are just pointers to sequentially stored data. But what differences does the difference in notation between [] and * imply? I mean in ALL possible usage contexts. For example:

char c[] = "test";

If you write this declaration in a function body, it allocates the string on the stack, while

char* c = "test";

will point to a (read-only) data segment.

Can you list all the differences between these two notations in ALL usage contexts, to form a clear general view?

Answer 1:

According to the C99 standard:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.

Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called array of T. The construction of an array type from an element type is called array type derivation.

A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes referred to as a pointer to T. The construction of a pointer type from a referenced type is called pointer type derivation.

According to the standard, the declarations…

char s[] = "abc", t[3] = "abc";
char s[] = { 'a', 'b', 'c', '\0' }, t[] = { 'a', 'b', 'c' };

…are identical. The contents of the arrays are modifiable. On the other hand, the declaration…

const char *p = "abc";

…defines p with type pointer to constant char and initializes it to point to an object of type array of char (array of constant char in C++) with length 4, whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

According to 6.5.2.1 (Array subscripting), dereferencing and array subscripting are identical:
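As a minimal sketch of what these declarations imply for object sizes (the function names are illustrative and not taken from the standard):

```c
#include <stddef.h>

/* An array initialised from "abc" with no explicit bound holds the
   three characters plus the terminating '\0'. */
size_t size_with_terminator(void)
{
    char s[] = "abc";
    return sizeof s;
}

/* An array initialised from exactly three characters has no room for
   a terminator, so it is not a C string. */
size_t size_without_terminator(void)
{
    char t[] = { 'a', 'b', 'c' };
    return sizeof t;
}

/* Reading through a pointer to the literal is fine; only writing
   through it is undefined. */
char read_through_pointer(void)
{
    const char *p = "abc";
    return p[1];
}
```

Assuming a conforming C99 compiler, the first function should report 4 bytes and the second 3.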

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
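The identity quoted above can be checked directly. The following is a small sketch with a hypothetical helper name; it compares addresses, so it also covers the less familiar spelling i[a]:

```c
#include <stdbool.h>

/* Returns true when a[i], *(a + i) and i[a] all designate the same
   element, as the definition of the subscript operator requires. */
bool subscript_forms_agree(const int *a, int i)
{
    return &a[i] == a + i && &i[a] == a + i;
}
```

Because the operator is defined purely in terms of pointer addition, the same expression works whether a was declared as an array (which decays to a pointer here) or as a pointer.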

The differences of arrays vs. pointers are:

  • a pointer carries no information about the size of the memory behind it (there is no portable way to get it)
  • an array of incomplete type cannot be constructed
  • a pointer type may be derived from an incomplete type
  • a pointer can define a recursive structure (a consequence of the previous two points)
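The points in this list can be illustrated with a short sketch. The struct and function names below are made up for the example:

```c
#include <stddef.h>

struct node;                    /* incomplete type at this point */

struct node *list_head;         /* OK: pointer to an incomplete type */
/* struct node pool[4]; */      /* error: array of incomplete type */

struct node {
    int value;
    struct node *next;          /* recursive structure through a pointer */
};

/* Walks the list and counts its nodes. */
size_t count_nodes(const struct node *n)
{
    size_t count = 0;
    for (; n != NULL; n = n->next)
        count++;
    return count;
}

/* sizeof on an array yields the size of the whole array... */
size_t array_bytes(void)
{
    int a[8];
    return sizeof a;
}

/* ...while sizeof on a pointer yields only the size of the pointer. */
size_t pointer_bytes(void)
{
    int a[8];
    int *p = a;
    return sizeof p;
}
```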

These links may be useful on the subject:

  • http://support.microsoft.com/kb/44463
  • http://www.cplusplus.com/forum/articles/9/


Answer 2:

char c[] = "test";

This creates an array containing the string test, so you can modify any character, for example:

c[2] = 'p';

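but with

char *c = "test";

c points to a string literal, which must be treated as read-only (in C++ its type is array of const char).
Modifying a string literal is undefined behavior and typically causes a segfault. So

c[2] = 'p';

is no longer valid and will typically crash with a segfault.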
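A small sketch of the safe variant follows, with hypothetical function names. It modifies the array copy and only reads through the pointer, since the write through the pointer cannot be demonstrated without invoking undefined behavior:

```c
#include <string.h>

/* Builds "test" in a local array, changes the third character as in
   the answer above, and copies the result into out.
   out must have room for at least 5 bytes. */
void patched_copy(char *out)
{
    char c[] = "test";          /* array: a private, writable copy */
    c[2] = 'p';
    memcpy(out, c, sizeof c);   /* sizeof c includes the '\0' */
}

/* Reads one character through a pointer to the literal. Declaring the
   pointer as const char * lets the compiler reject accidental writes. */
char literal_char(void)
{
    const char *p = "test";
    return p[2];
}
```

Declaring such pointers as const char * is a common way to turn the runtime crash into a compile-time error.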



Answer 3:

char [] denotes the type "array of unknown bound of char", while char * denotes the type "pointer to char". As you've observed, when a definition of a variable of type "array of unknown bound of char" is initialised with a string literal, the type is converted to "array[N] of char", where N is the appropriate size. The same applies in general to initialisation from an array aggregate:

int arr[] = { 0, 1, 2 };

arr is converted to type "array[3] of int".
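One way to observe the deduced bound is the usual sizeof idiom. This is a sketch; the ARRAY_LEN macro is a common convention and not part of the language:

```c
#include <stddef.h>

/* Number of elements in a true array (not valid for pointers). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

/* The bound of arr is deduced from its three initialisers. */
size_t deduced_length(void)
{
    int arr[] = { 0, 1, 2 };
    return ARRAY_LEN(arr);
}
```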

In a user-defined type definition (struct, class or union), array-of-unknown-bound types are prohibited in C++, although in C99 and later they are allowed as the last member of a struct, where they can be used to access allocated memory past the end of the struct; such a member is called a "flexible array member".
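A minimal C99 sketch of an array of unknown bound as the last struct member is shown below. The struct message type and the message_new function are invented for this example:

```c
#include <stdlib.h>
#include <string.h>

struct message {
    size_t length;
    char text[];    /* array of unknown bound as the last member (C99) */
};

/* Allocates a message with enough trailing storage for a copy of s.
   Returns NULL on allocation failure; the caller frees the result. */
struct message *message_new(const char *s)
{
    size_t n = strlen(s);
    struct message *m = malloc(sizeof *m + n + 1);
    if (m == NULL)
        return NULL;
    m->length = n;
    memcpy(m->text, s, n + 1);  /* copy including the '\0' */
    return m;
}
```

The trailing array contributes nothing to sizeof *m, which is why its storage is added explicitly in the malloc call.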

Recursive type construction is another difference: one can construct pointers to and arrays of char * (e.g. char ** and char *[10]), but an array of unknown bound cannot be used as the element type of another array, because an element type must be complete; one cannot write char [10][]. A pointer to an array of unknown bound, char (*)[], and an array of unknown bound whose elements are complete arrays, char [][10], are both fine.

Finally, cv-qualification operates differently; given typedef char *ptr_to_char; and typedef char array_of_unknown_bound_of_char[];, cv-qualifying the pointer version will behave as expected, while cv-qualifying the array version will migrate the cv-qualification to the element type: that is, const array_of_unknown_bound_of_char is equivalent to const char [] and not the fictional char (const) []. This means that in a function definition, where array-to-pointer adjustment is applied to the parameters before the prototype is constructed,

void foo (int const a[]) {
    a = 0;
}

is legal; in C++ there is no way to make the array-of-unknown-bound parameter itself non-modifiable (C99 allows the spelling int const a[const] for this purpose).
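The migration of the qualifier to the element type can be sketched with the typedef named above. The function skip_first is hypothetical:

```c
typedef char array_of_unknown_bound_of_char[];

/* const lands on the elements, so the parameter is adjusted to
   `const char *s`: the pointer may be reassigned, but the characters
   it points at may not be written. */
char skip_first(const array_of_unknown_bound_of_char s)
{
    s = s + 1;          /* legal: s itself is not const */
    /* s[0] = 'x'; */   /* would not compile: the elements are const */
    return s[0];
}
```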



Answer 4:

The whole lot becomes clear once you know that declaring a pointer variable does not create an object of the type it points at. It only creates the pointer variable.

So, in practice, if you need a string, you need to define an array of characters; a pointer to it can be used later on.



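Answer 5:

In most expressions an array name behaves like a constant pointer: it converts to the address of its first element and cannot be reassigned. It is not a pointer object, however; sizeof and & still treat it as an array.

Also, char c[] allocates memory for the array, whose base address is c itself. No separate memory is allocated for storing that address.

Writing char *c = "test" places the string in static storage and stores its base address in c. A separate memory location is used to store c itself.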
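The difference in storage can be made visible by comparing addresses. This is a rough sketch with invented function names:

```c
#include <stdbool.h>

/* For an array, the array object and its first element share one
   address: there is no separate object holding that address. */
bool array_address_is_data_address(void)
{
    char c[] = "test";
    return (const void *)&c == (const void *)&c[0];
}

/* A pointer is a distinct object whose value is the address of the
   data, so its own address differs from the address of the data. */
bool pointer_address_is_data_address(void)
{
    const char *p = "test";
    return (const void *)&p == (const void *)&p[0];
}
```

The first function should return true and the second false, since the pointer variable and the string it refers to are separate objects.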