What does the 'array name' mean in case of

Posted 2019-01-01 07:30

Question:

In my code:

 char *str[] = {"forgs", "do", "not", "die"};
 printf("%d %d", sizeof(str), sizeof(str[0]));  

I'm getting the output as 12 2, so my questions are:

  1. Why is there a difference?
  2. Both str and str[0] are char pointers, right?

Answer 1:

In most cases, an array name decays to the address of its first element, with type pointer to the element type. So you would expect a bare str to have the value &str[0] with type pointer to pointer to char.

However, this is not the case for sizeof. As the operand of sizeof, the array name keeps its array type, which here is array of 4 pointers to char.

The return type of sizeof is a size_t. If you have a C99 compiler, you can use %zu in the format string to print the value returned by sizeof.



Answer 2:

Though the question is already answered and accepted, I am adding some more description (also answering the original question) that I think will be helpful for new users. As far as I could find, this is not explained anywhere else (at least on Stack Overflow), hence I am adding it now.

First read: sizeof Operator

6.5.3.4 The sizeof operator, 1125:
When you apply the sizeof operator to an array type, the result is the total number of bytes in the array.

According to this, when sizeof is applied to the name of a declared array (not memory allocated through malloc), the result is the size in bytes of the whole array rather than just the size of an address. This is one of the few exceptions to the rule that the name of an array is converted (decays) to a pointer to the first element of the array. It is possible because the array's size is fixed and known at compile time, when the sizeof operator is evaluated (for a C99 variable-length array, sizeof is evaluated at run time instead).

To understand it better consider the code below:

#include <stdio.h>

int main(void)
{
    char a1[6],        // one-dimensional
         a2[7][6],     // two-dimensional
         a3[5][7][6];  // three-dimensional

    // sizeof yields a size_t, so print it with %zu (C99)
    printf(" sizeof(a1)   : %zu \n", sizeof(a1));
    printf(" sizeof(a2)   : %zu \n", sizeof(a2));
    printf(" sizeof(a3)   : %zu \n", sizeof(a3));
    printf(" Char         : %zu \n", sizeof(char));
    printf(" Char[6]      : %zu \n", sizeof(char[6]));
    printf(" Char[7][6]   : %zu \n", sizeof(char[7][6]));
    printf(" Char[5][7][6]: %zu \n", sizeof(char[5][7][6]));

    return 0;
}

Its output:

 sizeof(a1)   : 6 
 sizeof(a2)   : 42 
 sizeof(a3)   : 210 
 Char         : 1 
 Char[6]      : 6 
 Char[7][6]   : 42 
 Char[5][7][6]: 210 

Check the above working at Codepad. Notice that the size of char is one byte; if you replace char with int in the above program, every output will be multiplied by sizeof(int) on your machine.

Difference between char* str[] and char str[][] and how both are stored in memory

Declaration-1: char *str[] = {"forgs", "do", "not", "die"};

In this declaration str[] is an array of pointers to char. Each element str[i] points to the first char of one of the strings in {"forgs", "do", "not", "die"};.
Logically, str should be arranged in memory in the following way:

Array Variable:                Constant Strings:
---------------                -----------------

         str:                       201   202   203   204  205   206
        +--------+                +-----+-----+-----+-----+-----+-----+
 343    |        |= *(str + 0)    | 'f' | 'o' | 'r' | 'g' | 's' | '\0'|
        | str[0] |-------|        +-----+-----+-----+-----+-----+-----+
        | 201    |       +-----------▲
        +--------+                  502   503  504
        |        |                +-----+-----+-----+
 347    | str[1] |= *(str + 1)    | 'd' | 'o' | '\0'|
        | 502    |-------|        +-----+-----+-----+
        +--------+       +-----------▲
        |        |                  43    44    45    46
 351    | 43     |                +-----+-----+-----+-----+
        | str[2] |= *(str + 2)    | 'n' | 'o' | 't' | '\0'|
        |        |-------|        +-----+-----+-----+-----+
        +--------+       +-----------▲
 355    |        |
        | 9002   |                 9002  9003   9004 9005
        | str[3] |                +-----+-----+-----+-----+
        |        |= *(str + 3)    | 'd' | 'i' | 'e' | '\0'|
        +--------+       |        +-----+-----+-----+-----+
                         +-----------▲


Diagram: str[i] points to the first char of each constant string literal.
The memory addresses are assumed values for illustration.

Note: the elements of str[] are stored in contiguous memory, while each string is stored at an arbitrary address (not necessarily in contiguous space).

[ANSWER]

According to Codepad, the following code:

#include <stdio.h>

int main(int argc, char **argv){
    char *str[] = {"forgs", "do", "not", "die"};
    // %zu is the C99 conversion for size_t, the type sizeof yields
    printf("sizeof(str): %zu,  sizeof(str[0]): %zu\n", 
            sizeof(str), 
            sizeof(str[0])
    );  
    return 0;
}

Output:

sizeof(str): 16,  sizeof(str[0]): 4

  • In this code str is an array of 4 char addresses, and each char* is 4 bytes on that machine, so by the quote above the total size of the array is 4 * sizeof(char*) = 16 bytes.

  • The data type of str is char*[4].

  • str[0] is simply a pointer to char, so it is four bytes. The data type of str[i] is char*.

(Note: on some systems an address is 2 bytes or 8 bytes.)

Regarding the output, one should also read glglgl's comment on the question:

On whatever architecture you are, the first value should be 4 times the second one. On a 32 bit machine, you should get 16 4, on a 64 bit one 32 8. On a very old one or on an embedded system, you might even get 8 2, but never 12 2 as the array contains 4 element of the same size

Additional points:

  • Because each str[i] is a char* variable, it can be assigned the address of another string. For example, str[i] = "yournewname"; is valid for i = 0 to 3.

One more important point to notice:

  • In the example above each str[i] points to a constant string literal that must not be modified; hence str[i][j] = 'A' is invalid. Writing to a string literal is undefined behavior, and on systems that place literals in read-only memory it typically crashes at run time.
    But if str[i] points to an ordinary char array, then str[i][j] = 'A' can be a valid expression.
    Consider the following code:

      char a[] = "Hello"; // a[] is an ordinary, writable array
      char *str[] = {"forgs", "do", "not", "die"};
      //str[0][4] = 'A'; // error: writes to a read-only string literal
      str[0] = a;
      str[0][4] = 'A'; // valid because str[0] now points to a writable
                       // array; index 5 would overwrite the terminating '\0'
    

Check here working code: Codepad

Declaration-2: char str[][6] = {"forgs", "do", "not", "die"};:

Here str is a two-dimensional array of chars (every row has the same size) of size 4 * 6. (Remember that the column count must be given explicitly in the declaration of str; the row count is deduced as 4 from the number of strings.)
In memory, str[][] will look like the diagram below:

                    str
                    +---201---202---203---204---205---206--+
201                 | +-----+-----+-----+-----+-----+-----+|   
str[0] = *(str + 0)--►| 'f' | 'o' | 'r' | 'g' | 's' | '\0'||
207                 | +-----+-----+-----+-----+-----+-----+|
str[1] = *(str + 1)--►| 'd' | 'o' | '\0'| '\0'| '\0'| '\0'||
213                 | +-----+-----+-----+-----+-----+-----+|
str[2] = *(str + 2)--►| 'n' | 'o' | 't' | '\0'| '\0'| '\0'||
219                 | +-----+-----+-----+-----+-----+-----+|
str[3] = *(str + 3)--►| 'd' | 'i' | 'e' | '\0'| '\0'| '\0'||
                    | +-----+-----+-----+-----+-----+-----+|
                    +--------------------------------------+
  In the diagram:
  str[i] = *(str + i) designates the complete i-th row, which is 6 chars long.
  str[i] is an array of 6 chars.

This arrangement of 2D array in memory is called Row-Major: A multidimensional array in linear memory is organized such that rows are stored one after the other. It is the approach used by the C programming language.

Notice differences in both diagrams.

  • In the second case, the complete two-dimensional char array is allocated in contiguous memory.
  • For any i = 0 to 2, the addresses of str[i] and str[i + 1] differ by 6 bytes (the length of one row).
  • The double boundary line in this diagram means that str represents the complete 6 * 4 = 24 chars.

Now consider code similar to what you posted in your question, but for a 2-dimensional char array (check at Codepad):

#include <stdio.h>

int main(int argc, char **argv){
    char str[][6] = {"forgs", "do", "not", "die"};
    // %zu is the C99 conversion for size_t, the type sizeof yields
    printf("sizeof(str): %zu,  sizeof(str[0]): %zu\n", 
            sizeof(str), 
            sizeof(str[0])
    );
    return 0;
}

Output:

sizeof(str): 24,  sizeof(str[0]): 6

Following the sizeof operator's treatment of arrays, applying it to a 2-D array returns the size of the entire array, which is 24 bytes.

  • For sizeof(str) it returns 24, the size of the complete 2-D char array of 24 chars (6 columns * 4 rows).

  • In this declaration the type of str is char[4][6].

  • One more interesting point is that str[i] represents an array of chars and its type is char[6]. So sizeof(str[0]) is the size of one complete row = 6.

Additional points:

  • In the second declaration str[i][j] is not constant, and its content can be changed; e.g. str[i][j] = 'A' is a valid operation.

  • str[i] names a char array of type char[6], which is not assignable, so an assignment such as str[i] = "newstring" is illegal (in fact it is a compile-time error).

One more important difference between two declarations:

In Declaration-1: char *str[] = {"forgs", "do", "not", "die"};, the type of &str is char*(*)[4]; it is the address of an array of 4 char pointers.

In Declaration-2: char str[][6] = {"forgs", "do", "not", "die"};, the type of &str is char(*)[4][6]; it is the address of a 2-D char array of 4 rows and 6 columns.

If one wants to read similar description for 1-D array: What does sizeof(&array) return?



Answer 3:

It's 16 4 on my computer, and I can explain this: str is an array of char*, so sizeof(str)==sizeof(char*)*4

I don't know why you get 12 2 though.



Answer 4:

The two are different. str is an array of char pointers (in your example a char*[4]), and str[0] is a char pointer.

The first sizeof returns the size of the four char pointers the array contains, and the second returns the size of one char*.
In my tests the results are:

sizeof(str[0]) = 4   // = sizeof(char*)

sizeof(str) = 16  
            = sizeof(str[0]) + sizeof(str[1]) + sizeof(str[2]) + sizeof(str[3])
            = 4 * sizeof(char*)  
            = 4 * 4
            = 16