String pointer and array of chars in C

Published 2020-05-28 01:20

Question:

I just started learning C and am confused about string pointers versus strings (arrays of char). Can anyone help me clear my head a bit?

// source code
char name[10] = "Apple";
char *name2 = "Orange";

printf("the address of name: %p\n", name);
printf("the address of the first letter of name: %p\n", &(*name));
printf("first letter of name using pointer way: %c\n", *name);
printf("first letter of name using array way: %c\n", name[0]);
printf("---------------------------------------------\n");
printf("the address of name2: %p\n", name2);
printf("the address of the first letter of name2: %p\n", &(*name2));
printf("first letter of name2 using pointer way: %c\n", *name2);
printf("first letter of name2 using array way: %c\n", name2[0]);

// output
the address of name: 0x7fff1ee0ad50
the address of the first letter of name: 0x7fff1ee0ad50
first letter of name using pointer way: A
first letter of name using array way: A
---------------------------------------------
the address of name2: 0x4007b8
the address of the first letter of name2: 0x4007b8
first letter of name2 using pointer way: O
first letter of name2 using array way: O

So I assume that both name and name2 point to the address of their own first letter. My confusion is this (see the code below):

//code
char *name3; // declare a char pointer (not yet initialized)
name3 = "Apple"; // point to the first letter of "Apple", no compile error

char name4[10]; // reserve 10 bytes of memory
name4 = "Apple"; // compile error!

I create a char pointer called name3 and point it at the first letter of "Apple", which is fine. Then I create an array of char, reserving 10 bytes of memory, and try to make name4 point to the first letter of "Apple" in the same way. As a result, I get a compile error.

I am so frustrated by this programming language. Sometimes the two work the same way, but sometimes they don't. Can anyone explain the reason? And if I really want to declare a string or array of chars and fill it on a separate line, how can I do that?

Many thanks...

Answer 1:

You can initialize an array when you declare it, like this:

int n[5] = { 0, 1, 2, 3, 4 };
char c[5] = { 'd', 'a', 't', 'a', '\0' };

But since we typically treat char arrays as strings, C allows a special case:

char c[5] = "data";  // Terminating null character is added.

However, once you've declared an array, you can't reassign it. Why? Consider an assignment like

char *my_str = "foo";  // Declare and initialize a char pointer.
my_str = "bar";        // Change its value.

The first line declares a char pointer and "aims" it at the first letter in "foo". Since "foo" is a string constant, it resides somewhere in memory with all the other constants. When you reassign the pointer, you're assigning a new value to it: the address of "bar". But the original string, "foo", remains unchanged. You've moved the pointer, but haven't altered the data.

When you declare an array, however, you aren't declaring a pointer at all. You're reserving a certain amount of memory and giving it a name. So the line

char c[5] = "data";

starts with the string constant data, then allocates 5 new bytes, calls them c, and copies the string into them. You can access the elements of the array exactly as if you'd declared a pointer to them; arrays and pointers are (for most purposes) interchangeable in that way.

But since arrays are not pointers, you cannot reassign them.

You can't make c "point" anywhere else, because it's not a pointer; it's the name of an area of memory.
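Both halves of that can be checked in code. The following is only a minimal sketch; the function names are made up for illustration, and only strcmp() comes from the standard library.

```c
#include <string.h>

/* Hypothetical helper: re-aims a pointer and checks that the
   original literal is still intact afterwards. Returns 1 if so. */
int literal_survives_reassignment(void)
{
    const char *my_str = "foo";
    const char *original = my_str;   /* remember where "foo" lives */

    my_str = "bar";                  /* only the pointer changes */

    return strcmp(original, "foo") == 0 && strcmp(my_str, "bar") == 0;
}

/* Hypothetical helper: modifies an array that was initialized from a
   literal and checks that an identical literal is unaffected. */
int array_is_a_separate_copy(void)
{
    const char *literal = "data";
    char c[5] = "data";              /* c holds its own 5 bytes */

    c[3] = 'e';                      /* changes the copy only */

    return strcmp(c, "date") == 0 && strcmp(literal, "data") == 0;
}
```

Both functions return 1: moving the pointer never touches the characters of "foo", and writing into c never touches the literal it was initialized from.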

You can change the value of the string, either one character at a time, or by calling a function like strcpy():

c[3] = 'e';       // Now c = "date", or 'd', 'a', 't', 'e', '\0'
strcpy(c, "hi");  // Now c = 'h', 'i', '\0', 'e', '\0'
strcpy(c, "too long!");  // Undefined behavior: writes 10 bytes into a 5-byte array.
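One way to guard against that last case is to check the length before copying. This is a sketch under my own assumptions: copy_if_fits is a hypothetical helper, not a standard function, and it rejects the copy rather than truncating.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst only if the whole string,
   including its terminating '\0', fits in dst_size bytes.
   Returns 1 on success, 0 if the copy would overflow (dst untouched). */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* +1 for the terminating '\0' */

    if (needed > dst_size)
        return 0;

    memcpy(dst, src, needed);
    return 1;
}
```

With char c[5], the call copy_if_fits(c, sizeof c, "hi") succeeds, while copy_if_fits(c, sizeof c, "too long!") returns 0 and leaves c alone. Note that sizeof c only gives the array size where c is actually declared as an array, not inside a function that received it as a pointer.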

Efficiency Tip

Note also that initializing an array generally makes a copy of the data: the string is copied from the constant area into the array's own storage, where your program can change it. This can mean the text exists twice in memory. You can avoid the copy, and generally save memory, by declaring a pointer instead. That leaves the string in the constant area and allocates only enough memory for a pointer, regardless of the length of the string. The trade-off is that a string reached through such a pointer must be treated as read-only.



Answer 2:

Although a pointer and an array seem similar, they are different. char *name3 is just a pointer to char; it takes no more memory than a pointer. It just stores an address, so you can assign a string to it, and the address stored in name3 changes to the address of "Apple".

But your name4 is an array, char[10]; it holds the memory for 10 chars. If you want to assign to it, you need to use strcpy or something similar to write into its memory; you can't assign it an address like "Apple".



Answer 3:

You cannot directly reassign a value to an array type (e.g. your array of ten chars name4). To the compiler, name4 is an "array" and you cannot use the assignment = operator to write to an array with a string literal.

In order to actually move the content of the string "Apple" into the ten byte array you allocated for it (name4), you must use strcpy() or something of that sort.
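For the "declare on one line, fill on another" case from the question, here is a small sketch of the "something of that sort" option. set_name is a hypothetical helper of mine; it uses the standard snprintf(), which truncates instead of overflowing when the text is too long.

```c
#include <stdio.h>

/* Hypothetical helper: writes value into the caller's array.
   snprintf() never writes more than name_size bytes and always
   terminates the result with '\0' (when name_size > 0). */
void set_name(char *name, size_t name_size, const char *value)
{
    snprintf(name, name_size, "%s", value);
}
```

Usage would look like: char name4[10]; on one line, then set_name(name4, sizeof name4, "Apple"); on a later one. A value longer than 9 characters is cut to the first 9, so whether silent truncation is acceptable depends on the program.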

What you are doing with name3 is quite different. It is created as a char * and left uninitialized, so it holds garbage or zero (you can't know which at this point). Then you assign it the location of the static string "Apple". This string typically lives in read-only memory, and attempting to write through the name3 pointer is undefined behavior; on most systems it crashes.

Based on this, you can surmise that the last statement attempts to assign the memory location of a static string to something somewhere else that represents a collection of 10 chars. The language does not provide you with a pre-determined way to perform this task.

Herein lies its power.
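Since writing through a pointer to a string literal is not allowed, a common habit is to declare such pointers const so the compiler catches the mistake, and to copy the text into an array when it needs to change. A minimal sketch, with function names invented for illustration:

```c
#include <string.h>

/* Hypothetical helper: hands out a pointer to a string literal.
   The const qualifier turns accidental writes into compile errors. */
const char *fruit_literal(void)
{
    return "Apple";
}

/* Hypothetical helper: copies the literal into the caller's array
   (if it fits) and modifies the copy, never the literal. */
void make_writable_copy(char *dst, size_t dst_size)
{
    const char *name3 = fruit_literal();

    /* name3[0] = 'a';   would not compile: name3 points to const char */

    if (strlen(name3) + 1 <= dst_size) {
        strcpy(dst, name3);   /* dst now owns its own characters */
        dst[0] = 'a';         /* fine: dst is ordinary writable memory */
    }
}
```

After make_writable_copy(buf, sizeof buf) on a 10-byte buf, buf holds "apple" while the literal still reads "Apple".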



Answer 4:

When you say

char *name3 = "Apple";

you are declaring name3 to point to the static string "Apple". If you're familiar with higher-level languages, you can think of this as immutable (I'm going to explain it in this context because it sounds like you've programmed before; for the technical rationale, check the C standard).

When you say

char name4[10];
name4 = "Apple";

you get an error because you first declare an array of 10 chars (in other words, you are 'pointing' to the start of a 10-byte section of mutable memory), and then attempt to assign the immutable value "Apple" to this array. The data for "Apple" itself is allocated in some read-only segment of memory.

This means that the types do not match:

error: incompatible types when assigning to type 'char[10]' from type 'char *'

If you want name4 to have the value "Apple", use strcpy:

strcpy(name4, "Apple");

If you want name4 to have the initial value "Apple", you can do that as well:

char name4[10] = "Apple"; // shorthand for {'A', 'p', 'p', 'l', 'e', '\0'}

The reason that this works, whereas your previous example does not, is because "Apple" is a valid array initialiser for a char[]. In other words, you are creating a 10-byte char array, and setting its initial value to "Apple" (with 0s in the remaining places).

This might make more sense if you think of an int array:

int array[3] = {1, 2, 3}; // initialise array
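The "0s in the remaining places" rule holds for any array initializer that lists fewer elements than the array has, for char and int alike. A short sketch to check it (the function names are mine):

```c
#include <stddef.h>

/* Hypothetical helper: counts the zero bytes in a 10-byte array
   initialized from the 5-letter string "Apple". */
size_t zero_bytes_in_name4(void)
{
    char name4[10] = "Apple";
    size_t zeros = 0;

    for (size_t i = 0; i < sizeof name4; i++) {
        if (name4[i] == '\0')
            zeros++;
    }
    return zeros;   /* 10 bytes minus 5 letters */
}

/* Hypothetical helper: checks that elements left out of an int
   initializer are set to zero as well. */
int unlisted_ints_are_zero(void)
{
    int array[5] = {1, 2, 3};   /* array[3] and array[4] not listed */

    return array[3] == 0 && array[4] == 0;
}
```

zero_bytes_in_name4() returns 5 and unlisted_ints_are_zero() returns 1. This applies only to initialization; an array declared without an initializer inside a function has indeterminate contents.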

Probably the easiest colloquial explanation I can think of is that an array is a collection of buckets for things, whereas the static string "Apple" is a single thing 'over there'.

strcpy(name4, "Apple") works because it copies each of the things (characters) in "Apple" into name4 one by one.

However, it doesn't make sense to say, 'this collection of buckets is equal to that thing over there'. It only makes sense to 'fill' the buckets with values.
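To make the bucket-filling concrete, here is roughly what a strcpy-style copy does. copy_chars is a hypothetical stand-in written for illustration, not the actual library implementation, and like strcpy it assumes the destination is large enough.

```c
#include <stddef.h>

/* Hypothetical stand-in for strcpy(): copies the characters of src
   into dst one bucket at a time, including the terminating '\0'. */
char *copy_chars(char *dst, const char *src)
{
    size_t i = 0;

    /* Assign one character, then stop once the '\0' has been copied. */
    while ((dst[i] = src[i]) != '\0')
        i++;

    return dst;
}
```

Calling copy_chars(name4, "Apple") with char name4[10] fills the first six buckets ('A', 'p', 'p', 'l', 'e', '\0') and leaves the remaining four as they were.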



Answer 5:

I think this will help clear it up also:

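#include <stdio.h>

int main(void) {
    const char* ptr = "Hello";    // pointer to a string literal
    char  arr[] = "Goodbye";      // array holding its own copy

    // These both work, as expected:
    printf("%s\n", ptr);
    printf("%s\n", arr);
    printf("%s\n", (char *)&arr);   // also works! &arr is the same address as arr

    // %p expects a void *, hence the casts
    printf("ptr  = %p\n", (void *)ptr);
    printf("&ptr = %p\n", (void *)&ptr);
    printf("arr  = %p\n", (void *)arr);
    printf("&arr = %p\n", (void *)&arr);

    return 0;
}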

Output:

Hello
Goodbye
Goodbye
ptr  = 0021578C         
&ptr = 0042FE2C         
arr  = 0042FE1C         \__ Same!
&arr = 0042FE1C         /

So we see that arr and &arr print the same address. Since arr is an array, using its name gives you the address of its first byte, and &arr is the address of the whole array, which starts at that same byte. (Only the types differ: char * versus char (*)[8].)

arr is an array of 7+1 bytes that live on the stack of main(). The compiler generates instructions to reserve those bytes and then populate them with "Goodbye". There is no pointer!

ptr, on the other hand, is a pointer, 4 bytes wide on this build, also on the stack. That's why &ptr is very close to &arr. But what it points to is a static string ("Hello") that sits off in the read-only section of the executable (which is why ptr's value is a very different number).
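The same difference shows up in sizeof and in pointer arithmetic. The sketch below uses made-up function names; the pointer size depends on the platform, so it is compared against sizeof(char *) rather than a fixed number.

```c
#include <stddef.h>

/* Size of the array itself: 7 letters plus the terminating '\0'. */
size_t size_of_array(void)
{
    char arr[] = "Goodbye";
    return sizeof arr;
}

/* Size of a pointer, regardless of the string it points to. */
size_t size_of_pointer(void)
{
    const char *ptr = "Hello";
    return sizeof ptr;
}

/* &arr + 1 steps over the whole array: the type is char (*)[8]. */
size_t step_of_address_of_array(void)
{
    char arr[] = "Goodbye";
    return (size_t)((char *)(&arr + 1) - (char *)&arr);
}

/* arr + 1 steps over a single char: arr decays to char *. */
size_t step_of_decayed_array(void)
{
    char arr[] = "Goodbye";
    return (size_t)((arr + 1) - arr);
}
```

size_of_array() and step_of_address_of_array() both give 8, step_of_decayed_array() gives 1, and size_of_pointer() gives whatever sizeof(char *) is on the machine (4 in the 32-bit output above).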