The C++ standard does not require std::string to be refcounted. In practice, though, most C++ implementations provide refcounted, copy-on-write strings, which lets you pass a string by value as if it were a primitive type. These implementations (at least g++) also use atomic operations, making the strings lock-free and thread-safe.
A simple test shows the copy-on-write semantics:
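#include <iostream>
#include <string>
using namespace std;

void foo(string s)
{
    cout << (const void*)s.c_str() << endl;   // under COW: same buffer as the caller's string
    string ss = s;
    cout << (const void*)ss.c_str() << endl;  // under COW: still shared after the copy
    char p = ss[0];                           // non-const operator[] forces ss to unshare
    (void)p;                                  // silence the unused-variable warning
    cout << (const void*)ss.c_str() << endl;  // under COW: a new address
}

int main()
{
    string s = "coocko";
    cout << (const void*)s.c_str() << endl;
    foo(s);
    cout << (const void*)s.c_str() << endl;   // the original buffer is untouched
}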
Only two distinct addresses are printed, and the address changes exactly after a non-const member (operator[]) is used.
I tested this code with the HP, GCC and Intel compilers and got similar results: strings work as copy-on-write containers.
On the other hand, VC++ 2005 shows clearly that each string is fully copied.
Why?
I know that VC++ 6.0 had a bug: its reference-counting implementation was not thread-safe, which caused random program crashes. Is this the reason? Are they just afraid to use ref-counting any more, even though it is common practice? Do they prefer not to use ref-counting at all over fixing the issue?
Thanks
It is not the main reason, but I have seen a lot of incorrect code on the win32 platform that does something like
const_cast< char* >( str.c_str() )
Maybe Microsoft knows this and is looking out for developers :)
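If the point of such a cast is to get a writable character buffer (say, to hand to a C API), a sketch like the one below avoids it. The names fill_buffer and read_into_string are hypothetical, made up for this illustration, and writing through &buf[0] assumes contiguous storage, which C++11 guarantees.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that writes into a caller-supplied buffer.
// Returns the number of characters written, excluding the terminator.
inline std::size_t fill_buffer(char* out, std::size_t capacity)
{
    const char text[] = "hello";
    const std::size_t n = sizeof(text) - 1;
    if (capacity < n + 1) return 0;   // not enough room, write nothing
    std::memcpy(out, text, n + 1);    // copy the text plus its terminator
    return n;
}

// Get a writable buffer from the string itself instead of casting away
// the constness of c_str().
inline std::string read_into_string()
{
    std::string buf(64, '\0');        // owns 64 writable characters
    const std::size_t n = fill_buffer(&buf[0], buf.size());
    buf.resize(n);                    // trim to what was actually written
    return buf;
}
```

On a copy-on-write implementation the non-const operator[] unshares the buffer before handing out the pointer, whereas a const_cast on c_str() can end up writing into a buffer that other strings still share.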
Maybe Microsoft determined that string copying was not a big issue, as almost all C++ code uses pass by reference wherever possible. Maintaining a reference count has an overhead in space and time (ignoring locking) that perhaps they decided was not worth paying.
Or maybe not. If this is a concern for you, you should profile your application to determine whether string copying is a major overhead, and if it is, switch to a different string implementation.
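As a rough first check before reaching for a real profiler, something like the following sketch can compare pass-by-value with pass-by-reference on your own compiler. The function names and the time_calls helper are assumptions made for this example, and a micro-benchmark like this is only indicative.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Takes its argument by value: every call copies (or shares) the string.
inline std::size_t length_by_value(std::string s) { return s.size(); }

// Takes its argument by const reference: no copy is made.
inline std::size_t length_by_ref(const std::string& s) { return s.size(); }

// Times `iterations` calls of f(s) and returns the elapsed microseconds.
template <typename F>
long long time_calls(F f, const std::string& s, int iterations)
{
    using clock = std::chrono::steady_clock;
    volatile std::size_t sink = 0;   // keeps the calls from being optimized away
    const clock::time_point start = clock::now();
    for (int i = 0; i < iterations; ++i)
        sink = sink + f(s);
    const clock::time_point stop = clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_calls(length_by_value, s, n) and time_calls(length_by_ref, s, n) with a string of realistic length gives two numbers to compare; the gap will differ between a copy-on-write library and one that always copies.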
The standard actually requires that, if you use reference counting, the semantics are the same as for a non-reference-counted version. This is not trivial in the general case (which is why you should not write your own string class).
Because of the following situation:
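A minimal sketch of one such situation, assuming the case meant here is a reference into the string that is taken before a copy is made (the function name copy_is_independent is made up for this illustration):

```cpp
#include <string>

// Returns true if y kept its original character after x was modified
// through a reference that was obtained before the copy was made.
inline bool copy_is_independent()
{
    std::string x("This is a string");
    char& x5 = x[5];      // a reference into x's buffer escapes
    std::string y(x);     // a naive refcounted string would share the buffer here
    x5 = '*';             // must change x only, never y
    return x[5] == '*' && y[5] == 'i';
}
```

A refcounted implementation has to notice that a reference has escaped and stop sharing that buffer, otherwise the write through x5 would silently change y as well.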
See: http://www.sgi.com/tech/stl/string_discussion.html for more details
I think that more and more std::string implementations will move away from refcounting/copy-on-write, as it is often a counter-optimization in multi-threaded code. See Herb Sutter's article Optimizations That Aren't (In a Multithreaded World).
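To make the cost concrete, here is a toy copy-on-write string, a sketch only and not how any real library implements it. The class CowString is hypothetical and leans on std::shared_ptr for the reference count; the use_count() check in detach() is fine single-threaded but is not a safe test when several threads touch the same object.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string: copies share one buffer until someone writes.
class CowString {
public:
    explicit CowString(const char* text)
        : data_(std::make_shared<std::string>(text)) {}

    // The compiler-generated copy constructor only bumps the shared_ptr
    // reference count, typically an atomic increment in threaded programs.

    char get(std::size_t i) const { return (*data_)[i]; }

    // Every write must first check whether the buffer is shared.
    void set(std::size_t i, char c)
    {
        detach();
        (*data_)[i] = c;
    }

    // Address of the underlying characters, for observing sharing.
    const void* address() const { return data_->data(); }

private:
    // Make a private copy if anyone else still refers to the buffer.
    void detach()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};
```

Every copy and every destruction touches the shared counter, and every mutating call pays for the sharing check, even in programs that never actually share a string between threads.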
As stated by Martin & Michael, Copy On Write (COW) is often more trouble than it's worth. For further reading, see this excellent article by Kevlin Henney about Mad COW Disease. I believe it was Andrei Alexandrescu who stated that Small String Optimization performs better in many applications (but I can't find the article).
Small String Optimization is where you make the string object bigger and avoid heap allocations for small strings. A toy implementation will look something like this: