I need to store a dynamic array of bits.
The C++ reference page on vector< bool > has the following information:
The storage is not necessarily an array of bool
values, but the library implementation may optimize storage so that each value is stored in a single bit.
How do I make sure that my program that uses vector<bool>
does in fact store bits in a vector instead of boolean values (bytes)?
Don't try to do that. Instead, use boost::dynamic_bitset
which clearly indicates what you actually want. The vector<bool>
optimization actually creates a number of possibilities for bugs, for example when using iterators (because it usually returns a proxy object).
Well, you can always look into the header files that come with your compiler. Since STL containers are almost exclusively template classes, most if not all parts of the implementation will be visible in the headers.
Maybe looking at a vector
object in a debugger can also be helpful.
Note: You should also be aware that vector<bool>
is meanwhile rather frowned upon by the C++ community, and that this optimization is for size, not for speed:
https://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=98
It may be possible to check this at compile time, by checking the return type of the non-const version of vector<bool>::operator[]
: An implementation that stores its values as bits will have to return a proxy reference class rather than a bool&
.
There's really nothing to check here at all. The specialization of vector<bool>
to store bits instead of larger objects is required by the standard. §23.2.5: "To optimize space allocation, a specialization of vector for bool elements is provided:".
I suppose from some viewpoint, what you've quoted is at least sort of correct. Since there's essentially nobody to certify conformance of a compiler, and essentially no compiler that even attempts to fulfill all conformance requirements, a compiler could choose to ignore this requirement as well.
I don't know of any compiler that does so though -- and if anybody did, I'd guess it would probably be pretty well known. There have been pretty heated discussions at times about removing the vector<bool>
specialization, so if somebody had real-life examples of how much better (or worse) that made things, I suspect we'd have heard about it.
Edit: In C++11, the requirements for std::vector<bool>
have been moved to §23.3.7. More importantly, the wording has been changed, to specify that a packed representation where each bool
is stored as a single bit instead of a contiguous allocation of bool
values is now only a recommendation.
At least IMO, this makes little real difference. As far as I know, all real implementations still use a packed representation, so even though packed storage is no longer theoretically guaranteed, it happens in practice anyway.
This program kind of prooves it.
#include <vector>
#include <iostream>
template <typename T>
void showSize() {
std::vector<T> myvec;
size_t capacity = myvec.capacity();
std::cout << "capacity: " << myvec.capacity() << std::endl;
std::cout << "size: " << myvec.size() << std::endl;
while (myvec.capacity() < 1024) {
while (myvec.capacity() == capacity) {
myvec.push_back(T());
}
capacity = myvec.capacity();
std::cout << "capacity: " << myvec.capacity() << std::endl;
std::cout << "size: " << myvec.size() << std::endl;
}
}
int main(int, char**) {
std::cout << std::endl << std::endl;
std::cout << "*********************" << std::endl << std::endl;
std::cout << "Booleans: " << std::endl << std::endl;
showSize<bool>();
std::cout << std::endl << std::endl;
std::cout << "*********************" << std::endl << std::endl;
std::cout << "Chars: " << std::endl << std::endl;
showSize<char>();
}
output:
*********************
Booleans:
capacity: 0
size: 0
capacity: 64
size: 1
capacity: 128
size: 65
capacity: 256
size: 129
capacity: 512
size: 257
capacity: 1024
size: 513
*********************
Chars:
capacity: 0
size: 0
capacity: 1
size: 1
capacity: 2
size: 2
capacity: 4
size: 3
capacity: 8
size: 5
capacity: 16
size: 9
capacity: 32
size: 17
capacity: 64
size: 33
capacity: 128
size: 65
capacity: 256
size: 129
capacity: 512
size: 257
capacity: 1024
size: 513
So the key is that the capacity for bools increases 64 entries at a time (size of int or my machine). This hints that it's just reserving just 8 bytes at a time.
Create a huge vector<bool>
and look at the memory usage of the program.
Or simply check out the source code - you can peek at the vector
header.