How is an array stored in memory in this program?

Posted 2019-06-01 01:51

How is an array stored in memory in this program? What happened here? How should I understand this behaviour in C? (Is it undefined, unspecified, or implementation-defined behaviour?)

#include <stdio.h>

int main()
{
    char a[5] = "world";
    char b[16] = "haii how are you";

    printf("string1 %s\nstring2 %s\n", a, b);

    return 0;
}

output:

user@toad:~$ gcc -Wall simple.c
user@toad:~$ ./a.out 
string1 world
string2 haii how are youworld
user@toad:~$ 

But this works fine:

char a[5] = "world";
char b[17] = "haii how are you?"; 

6 answers
冷血范
#2 · 2019-06-01 02:01

How is an array stored in memory in this program?

First, notice that your program has undefined behavior: you call printf with %s on char arrays that are not strings, since neither array has room for the zero termination. For instance, you reserve only 5 chars for a, and world takes up all 5, leaving no room for the terminator.

A strict person would say that due to UB it makes no sense to speculate about what is going on - with UB anything can happen.

But if we do it anyway then it is likely as described below.

The answer depends on your system, as C doesn't specify all aspects of how data is stored. It is specified that an array must occupy contiguous memory, but exactly how and where that memory is located is beyond the standard.

From the output you got, it seems that your system has laid out the arrays like this:

haii how are youworld
^               ^
b               a

You can't know what is after the last d.

However, when you print a you get the output world which tells us that "by luck" there is a '\0' just after the last d.

haii how are youworld'\0'
^               ^      ^
b               a      "luck"

So printing a will give world and printing b will give haii how are youworld.

Your code should be:

char a[6] = "world";
char b[17] = "haii how are you";

to make room for the termination of each string and so that your memory layout would be

haii how are you'\0'world'\0'
^                   ^      
b                   a

Note: The '\0' that you got "by luck" is probably there because your system zero-initializes the memory it hands to your program at start-up. C itself does not guarantee this.

地球回转人心会变
#3 · 2019-06-01 02:11

The code is not valid C++. A C++ compiler refuses to compile it:

> clang++ test.cxx 
test.cxx:5:14: error: initializer-string for char array is too long
        char a[5] = "world";
                    ^~~~~~~
1 error generated.

A C compiler such as gcc accepts the declaration, because C allows the terminating null character to be dropped when the array has no room for it. The array then holds only the five characters, with no null character at the end.

Juvenile、少年°
#4 · 2019-06-01 02:11

As others have already mentioned, the array declaration itself is fully conforming in C (you declare an array of chars, not a string):

An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

(Section 6.7.8/14 of C99 standard n1256.pdf; thanks to @Michi for pointing out the paragraph)

Trying to print these arrays using a plain %s is undefined, though. However, you can specify the maximum number of characters to print in the format string (%.5s); with that precision in place the call is well defined again.


Concerning the memory layout: C does not make many promises about how the memory is laid out actually. The only thing I can find is:

An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called ‘‘array of T’’. The construction of an array type from an element type is called ‘‘array type derivation’’.

(Section 6.2.5/20 of C99 standard n1256.pdf)


Note however, that in C++ even the code

char test[5] = "12345";

is illegal:

There shall not be more initializers than there are array elements. [ Example:

char cv[4] = "asdf";    // error

is ill-formed since there is no space for the implied trailing ’\0’. — end example ]

(Section 8.5.2/2 of C++ 14 standard n4296.pdf)

欢心
#5 · 2019-06-01 02:15

Neither of the strings in the first snippet is null-terminated. They are just character arrays, not null-terminated strings.
printf with the %s specifier expects a pointer to a null-terminated string as its argument. Passing an array without a terminator invokes undefined behavior.

printf writes the string to standard output until it encounters a '\0' character. If there is no '\0', it reads past the end of the array. Since a and b are not null-terminated, it is likely that, after writing b to the terminal, printf kept searching for a '\0' and found one only after the contents of a.

Fickle 薄情
#6 · 2019-06-01 02:17

As per the C11 standard (6.7.9 Initialization).

    EXAMPLE 8
The declaration
char s[] = "abc", t[3] = "abc";

defines ‘‘plain’’ char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration
char *p = "abc";
defines p with type ‘‘pointer to char’’ and initializes it to point to an object with type ‘‘array of char’’
with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to
modify the contents of the array, the behavior is undefined.

Applying this to your declarations:

char a[5] = "world";
char b[16] = "haii how are you";

No '\0' is stored at the end of "world" or "haii how are you", because the arrays have no room for it. So printf keeps reading until it finds a '\0', which is why printing b shows the contents of both b and a.

走好不送
#7 · 2019-06-01 02:19
char a[5] = "world";

"world" has 5 characters, and a string must also end with a null character (\0), which needs a sixth element.

If you define it like this:

char a[6] = "world";

then the compiler puts the null character at the end for you.

As for your question: C++ compilers reject char a[5] = "world", while C compilers accept it without storing the null character. It seems that your memory is laid out like this:

[ h ] [ a ] [ i ] [ i ] [ ] [ h ] [ o ] [ w ] [ ] [ a ] [ r ] [ e ] [ ] [ y ] [ o ] [ u ] [ w ] [ o ] [ r ] [ l ] [ d ] [ \0 ]

Finally, note that %s prints characters until it reaches a null character (\0).
