VC++ function string::c_str(): the address of the

Posted 2019-09-09 10:56

Question:

I ran into a strange problem when using the result of a string’s c_str() function: the behavior is inconsistent with g++.

There is a function called Test that returns a string instance, and I want to store the result in a char* (this is required). As you can see, the function simply returns the string “resultstring”. But when I try to get the result, something strange happens.

In Visual Studio, the result I get in part two is “”, while part one and part three both print “resultstring”. With the same code compiled by g++, all three parts print “resultstring”. Let’s look at the results first:

result of vs:

address:16841988

resultstring

address:16842096

"here is an empty line"

address:16842060

address:16842144

address:16842396

address:16842396

resultstring

result of g++

address:5705156

resultstring

address:5705156

resultstring

address:5705156

address:5705196

address:5705156

address:5705156

resultstring

The code is very simple and is listed below:

#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

string Test()
{
     char a[64] = "resultstring";
     return  string(a);
}
int main(void)
{
    //part one
    cout << "address:"<< (unsigned)Test().c_str() << endl;
    cout << Test().c_str() << endl;

    //part two
    char *j  = const_cast<char*>(Test().c_str());
    cout << "address:"<< (unsigned)Test().c_str() << endl;

    cout << j << endl;
    cout << "address:" << (unsigned)j <<endl;

    //part three
    string h3 = Test();
    char* j2 = const_cast<char*>(h3.c_str());
    cout << "address:"<< (unsigned)Test().c_str() << endl;
    cout << "address:"<< (unsigned)h3.c_str() << endl;

    cout << "address:" << (unsigned)j2 <<endl;
    cout << j2 <<endl;
    getchar();
    return 0;

}


Now I have three questions.

First, why does the g++ build print resultstring everywhere, while the Visual Studio build prints resultstring everywhere except for the variable j? If you step through this in the debugger, you’ll find that VC++ has set the memory that j points to to something like 00 65 73 75 …, which is esultstring preceded by a 00 byte. So it is not strange that we get “”. It’s just like char* str = "\0something else": you’ll always get "". But the question is, why does this happen only with j?

Second, why is one of the addresses printed by (unsigned)Test().c_str() different from the others? If we remove the line string h3 = Test(), the addresses are all the same.

Third, is it “correct” behavior for Visual Studio to produce “” for the variable j? Why is it different from g++?

Looking forward to your replies.

Regards, Kevin

Answer 1:

You have undefined behavior. The std::string returned by Test() is a temporary, and the pointer returned by c_str() (stored in j) is no longer valid once the lifetime of the temporary ends, which happens at the end of the full expression that created it. This means that anything can happen. The array the pointer points to may contain garbage, it may still hold the original string, or the implementation may have written a null terminator at the beginning of it. Accessing it may cause a segmentation fault, or it may let you read the original string data. This can and usually does vary between different compilers and implementations of the standard library.

char *j  = const_cast<char*>(Test().c_str());
// The contents pointed to by j are no longer valid, and accessing that content
// is undefined behavior
cout << "address:"<< (unsigned)Test().c_str() << endl;

The address is different between calls to Test() because it returns a new temporary each time you call it. Some compilers may optimize this, and the allocator may happen to hand back the same block of memory, but the address is not guaranteed to be the same.
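One way to avoid the problem is to keep the string alive for as long as the pointer is used. The sketch below reuses Test() from the question; SafeLength and SafeLengthViaReference are hypothetical helper names introduced only for this illustration. It shows two options: copying the result into a named std::string, or binding the temporary to a const reference, which extends the temporary's lifetime to that of the reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Same shape as the Test() function from the question.
std::string Test()
{
    char a[64] = "resultstring";
    return std::string(a);
}

// Hypothetical helper: store the result in a named std::string first,
// so the buffer behind c_str() lives as long as `owner` does.
std::size_t SafeLength()
{
    std::string owner = Test();      // named object, not a temporary
    const char* p = owner.c_str();   // valid while owner is alive and unmodified
    assert(std::strcmp(p, "resultstring") == 0);
    return std::strlen(p);
}

// Hypothetical helper: binding the temporary to a const reference
// extends its lifetime to the lifetime of the reference.
std::size_t SafeLengthViaReference()
{
    const std::string& ref = Test();
    return std::strlen(ref.c_str());
}
```

In both helpers the pointer is only used while the owning std::string still exists, which is exactly what part three of the question does and what part two does not.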



Answer 2:

This is totally flawed. You create and destroy a temporary string every time you call Test(). Any attempt to access memory through the pointer returned by Test().c_str() after the temporary has been destroyed makes no sense, because the memory has already been freed. It MIGHT still hold the old values (if nothing was written there before the access), but it might just as well hold anything else (if it was reused before the access). It's Undefined Behavior.

In the case of VC++, the memory is overwritten in one case and not in the others. With GCC, it is never overwritten. But this is pure chance. Once again, it's UB.
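The point at which a temporary is destroyed can be observed without touching any dangling pointer. The following is a minimal sketch, not taken from the question: Tracer, MakeTracer, RunDemo and g_log are hypothetical names used only here. Tracer logs its construction and destruction, and RunDemo logs a marker right after the statement that uses the temporary.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
std::vector<std::string> g_log;

// Hypothetical type that records its own construction and destruction.
struct Tracer
{
    Tracer()  { g_log.push_back("ctor"); }
    ~Tracer() { g_log.push_back("dtor"); }
    int value() const { return 42; }
};

// Returns a temporary, like Test() in the question.
Tracer MakeTracer() { return Tracer(); }

// Uses a temporary in one statement, then logs a marker.
std::vector<std::string> RunDemo()
{
    g_log.clear();
    int v = MakeTracer().value();   // the temporary dies at the end of this statement
    assert(v == 42);
    g_log.push_back("next statement");
    return g_log;
}
```

In the log returned by RunDemo(), the "dtor" entry appears before "next statement": the temporary is already gone by the time the following statement runs. The same thing happens to the std::string returned by Test() in part two of the question, so j is dangling before it is ever printed.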