C++: Is “my text” a std::string, a *char or a c-st

Posted 2019-04-27 13:49

Question:

I have just done what appears to be a common newbie mistake:

First we read one of many tutorials that goes like this:

 #include <fstream>
 int main() {
      using namespace std;
      ifstream inf("file.txt");
      // (...)
 }  

Secondly, we try to use something similar in our code, which goes something like this:

#include <fstream>
#include <string>
int main() {
    using namespace std;
    std::string file = "file.txt"; // Or get the name of the file 
                                   // from a function that returns std::string.
    ifstream inf(file);
    // (...)
}

Thirdly, the newbie developer is perplexed by some cryptic compiler error message.

The problem is that, before C++11, ifstream's constructor takes a const char * argument, not a std::string.

The solution is to convert the std::string to a const char *, for example with its c_str() member function.

Now, the real problem is that, for a newbie, "file.txt" or the similar examples given in almost all the tutorials look very much like a std::string.

So, is "my text" a std::string, a c-string or a char*, or does it depend on the context?

Can you provide examples of how "my text" would be interpreted differently according to context?

[Edit: I thought the example above would have made it obvious, but I should have been more explicit nonetheless: what I mean is the type of any string enclosed within double quotes, i.e. "myfilename.txt", not the meaning of the word 'string'.]

Thanks.

Answer 1:

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

  • Neither C nor C++ has a built-in string data type, so any double-quoted string in your code is essentially a const char * (or, to be exact, a const char[N] array). "C string" usually refers to this, specifically a character array with a null terminator.
  • In C++, std::string is a convenience class that wraps a raw string into an object. By using this, you can avoid having to do (messy) pointer arithmetic and memory reallocations by yourself.
  • Most standard library functions still take only char * (or const char *) parameters.
  • You can implicitly convert a char * into std::string because the latter has a constructor to do that.
  • You must explicitly convert a std::string into a const char * by using the c_str() method.

Thanks to Clark Gaebel for pointing out constness, and jalf and GMan for mentioning that it is actually an array.



Answer 2:

"myString" is a string literal, and has the type const char[9], an array of 9 constant char. Note that it has enough space for the null terminator. So "Hi" is a const char[3], and so forth.

This is pretty much always true, with no ambiguity. However, whenever necessary, a const char[9] will decay into a const char* that points to its first element. And std::string has an implicit constructor that accepts a const char*. So while it always starts as an array of char, it can become the other types if you need it to.

Note that string literals have the unique property that a const char[N] literal can also be converted to char*, but this conversion was deprecated in C++03 and removed in C++11 (some compilers still accept it with a warning). If you try to modify the underlying string this way, you end up with undefined behavior. It's just not a good idea.



Answer 3:

std::string file = "file.txt"; 

The right-hand side of the = contains a (raw) string literal (i.e. a null-terminated byte string). Its effective type is array of const char.

The = is a tricky pony here: no assignment happens. The std::string class has a constructor that takes a pointer to char as an argument. It is called to create a temporary std::string, which is then used to copy-construct (using the copy ctor of std::string) the object file of type std::string.

The compiler is free to elide the copy ctor and directly instantiate file though.

However, note that std::string is not the same thing as a C-style null-terminated string. It is not even required to be null-terminated.

ifstream inf("file.txt");

The std::ifstream class has a ctor that takes a const char * and the string literal passed to it decays to a pointer to the first element of the string.

The thing to remember is this: std::string provides (almost seamless) conversion from C-style strings. You have to look up the signature of the function to see if you are passing in a const char * or a std::string (the latter because of implicit conversions).



Answer 4:

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

It depends entirely on the context. :-) Welcome to C++.

A C string is a null-terminated string, which is almost always the same thing as a char*.

Depending on the platforms and frameworks you are using, there might be even more meanings of the word "string" (for example, it is also used to refer to QString in Qt or CString in MFC).



Answer 5:

The C++ standard library provides a std::string class to manage and represent character sequences. It encapsulates the memory management and is most of the time implemented as a C-string; but that is an implementation detail. It also provides manipulation routines for common tasks.

The std::string type will always be that (it doesn't have a conversion operator to char* for example, that's why you have the c_str() method), but it can be initialized or assigned to by a C-string (char*).

On the other hand, if you have a function that takes a std::string or a const std::string& as a parameter, you can pass a c-string (char*) to that function and the compiler will construct a std::string in-place for you. That would be a differing interpretation according to context as you put it.



Answer 6:

Neither C nor C++ has a built-in string data type.

When the compiler finds a double-quoted string that is referred to implicitly (see the code below), it stores the string itself in the program's code/text and generates a character array for it:

  • The array is created in static storage because it must persist to be referred to later.
  • The array is made constant because it must always contain the original data (Hello).

So in the end, what you have is a const char * pointing to this constant static character array.

const char* v()
{
    const char* text = "Hello";
    return text;
    // The code above can be reduced to:
    // return "Hello";
}

At run time, when control reaches the opening brace, text, the pointer, is created on the stack. The constant array of 6 elements (including the null terminator '\0' at the end) lives in the static memory area. On the next line (const char* text = "Hello";), the starting address of the 6-element array is assigned to text. The next line (return text;) returns that address. At the closing brace text disappears from the stack, but the array is still in the static memory area.

Compilers that still accept the old conversion from a string literal to char* do not force you to make the return type const. But if you try to change a value in the static array through a non-const char*, the behavior is undefined (typically a crash at run time), because the array is constant. So it is always good to make the return type const, to make sure the array cannot be referred to through a non-const pointer.

But if the compiler finds a double-quoted string that is explicitly used to initialize an array, it assumes that the programmer is going to (smartly) handle it. See the following wrong example:

const char* v()
{
    char text[] = "Hello";
    return text; // Wrong: returns the address of a local array
}

During compilation, the compiler saves the quoted text as it is in the code, so that it can fill the generated array at run time. It also calculates the array size, in this case again 6.

At run time, at the opening brace, the array text[] with 6 elements is created on the stack, but not yet initialized. When control reaches (char text[] = "Hello";), the array is initialized with the text from the compiled code. So the array is now on the stack. When control reaches (return text;), the function returns the starting address of the array text. At the closing brace, the array disappears from the stack, so there is no valid way to refer to it through the returned pointer.

Most standard library functions still take only char * (or const char *) parameters.

The Standard C++ library has a powerful class called string for manipulating text. Internally, a string stores its data in a character array. The Standard C++ string class is designed to take care of (and hide) all the low-level manipulations of character arrays that were previously required of the C programmer. Note that std::string is a class:

  • You can implicitly convert a char * into std::string because the latter has a constructor to do that.
  • You can explicitly convert a std::string into a const char * by using the c_str() method.


Answer 7:

As often as possible it should mean std::string (or an alternative such as wxString, QString, etc., if you're using a framework that supplies one). Sometimes you have no real choice but to use a NUL-terminated byte sequence, but you generally want to avoid it when possible.

Ultimately, there simply is no clear, unambiguous terminology. Such is life.



Answer 8:

To use the proper wording (as found in the C++ language standard), a string is one of the varieties of std::basic_string (including std::string) from chapter 21.3 "String classes" (as in C++0x N3092), while the argument of ifstream's constructor is an NTBS (null-terminated byte string).

To quote, C++0x N3092 27.9.1.4/2.

basic_filebuf* open(const char* s, ios_base::openmode mode);

...

opens a file, if possible, whose name is the NTBS s



Tags: c++ string char