Can we rely on the reduce-capacity trick?

2020-01-25 11:25发布

问题:

Is it actually guaranteed anywhere that the following reduce-capacity trick will "work"?

int main() {
   std::string s = "lololololol";
   s = "";                        // capacity still non-zero

   string(s).swap(s);             // ?
}

It doesn't seem to "work" for me (in that the capacity remains non-zero), and I can't find anything in the standard that says anything more than that the "contents" must be swapped between the two [here, identical] objects.

Similarly, for sequence containers:

int main() {
   vector<int> v { 1,2,3,4,5 };
   v.clear();                   // capacity still non-zero

   vector<int>(v).swap(v);      // ?
}

As far as I'm aware, this "trick" is semi-widely used; perhaps this widespread adoption is misguided?

(Of course, in C++11 we have shrink_to_fit [albeit non-binding] instead, making this kind of moot.)

回答1:

I've always been taught that there is no guaranteed standard way to lower the capacity. All methods have been (and still are) implementation defined.

§ 23.2.1\8 says:

The expression a.swap(b), for containers a and b of a standard container type other than array, shall exchange the values of a and b without invoking any move, copy, or swap operations on the individual container elements...

This guarantees that the internal pointers of vectors must be swapped.
However, I cannot find anything that guarantee on the capacity of a newly created vector.

§ 21.4.2\1 says that one of the basic_string default constructor's post conditions is that capacity() returns an unspecified value.
§ 21.4.2\3 says that one of the basic_string copy constructor's post conditions is that capacity() returns a value at least as big as size().
§ 21.4.6.8\2 says that string::swap runs in constant time, which (effectively) requires that the internal pointers are swapped.

As far as I can tell, a conforming implementation could have string::max_size() { return 4;}, and swapping all internals from one buffer to another would therefore be constant time. (vector can't do that though)

Obviously, take this all with a grain of salt. I'm quoting from the C++ draft from Feb28,'11, and I can't find specifications for the vector's copy constructor. Also, not finding evidence for is not the same as finding evidence against.



回答2:

On his errata page for "Effective STL," Scott Meyers notes:

When string implementations use reference counting, the swap trick using the copy constructor doesn't decrease the capacity, because the copy constructor isn't allocating any memory; it's just adjusting a reference count. A more reliable way to perform shrink-to-fit is to create the temporary string via the range constructor, e.g., string(s.begin(), s.end()).swap(s); This version of the swap trick is safer for vectors, too, because it eliminates any chance that the copy constructor will copy the other vector's excess capacity (which implementations are permitted to do).

As for the 'guarantee,' Meyers notes:

The language police require that I inform you that there's no guarantee that this technique will truly eliminate excess capacity. Implementers are free to give vectors and strings excess capacity if they want to, and sometimes they want to. [Effective STL, Item 17]



回答3:

From http://www.gotw.ca/gotw/054.htm:

Some implementations may choose to round up the capacity slightly to their next larger internal "chunk size," with the result that the capacity actually ends up being slightly larger than the size.



回答4:

How this works is probably entirely implementation-defined. Unlike containers such as vector, strings can have very different implementations.

If the string implementation uses small string optimization, then you can't lower capacity beyond a certain threshold. If the string implementation uses copy-on-write, then no write occurs and no real copy is made.

According to http://www.gotw.ca/gotw/054.htm, shrink-to-fit and clear-completely are different tricks. If the intention is to clear completely, then swapping with a default-constructed string can be expected to give better results.



标签: c++ std