I have got here two programs with me, both are doing exactly the same task. They are just setting an boolean array / vector to the value true. The program using vector takes 27 seconds to run whereas the program involving array with 5 times greater size takes less than 1 s. I would like to know the exact reason as to why there is such a major difference ? Are vectors really that inefficient ?
Program using vectors
#include <iostream>
#include <vector>
#include <ctime>
using namespace std;
int main(){
const int size = 2000;
time_t start, end;
time(&start);
vector<bool> v(size);
for(int i = 0; i < size; i++){
for(int j = 0; j < size; j++){
v[i] = true;
}
}
time(&end);
cout<<difftime(end, start)<<" seconds."<<endl;
}
Runtime - 27 seconds
Program using Array
#include <iostream>
#include <ctime>
using namespace std;
int main(){
const int size = 10000; // 5 times more size
time_t start, end;
time(&start);
bool v[size];
for(int i = 0; i < size; i++){
for(int j = 0; j < size; j++){
v[i] = true;
}
}
time(&end);
cout<<difftime(end, start)<<" seconds."<<endl;
}
Runtime - < 1 seconds
Platform - Visual Studio 2008 OS - Windows Vista 32 bit SP 1 Processor Intel(R) Pentium(R) Dual CPU T2370 @ 1.73GHz Memory (RAM) 1.00 GB
Thanks
Amare
You're using std::vector of bool and that's not what you think it is!
vector of bool is a bastard child template specialization that should never have existed and actually stores 1 bool in each bit. Accesses into it are more complicated due to masking and shifting logic, so will definitely be somewhat slower.
Click here for some info on vector of bool.
Also, you may be running an unoptimized build (almost certainly given the times you listed, 27 seconds is outrageous for 4 million iterations). The Standard Template Library relies very heavily on the optimizer to do things like inline function calls and elide temporaries. Lack of this optimization causes an especially heavy performance degradation for vector of bool because it has to return a proxy object when you index into it, because you can't take the address of a bit, so operator [] can't return a reference.
Click here for more info on proxied containers (The last half is about vector of bool)
In addition many STL implementations have helpful debugging bits that are not part of the standard which help you catch bugs, but really drag performance down. You'll want to make sure those are disabled in your optimized build.
Once you turn on the optimizer, have the right settings (i.e. no STL debugging turned on), and are actually testing the same thing in both loops, you will see almost no difference.
I had to make your loop much larger to test on my machine, but here's two builds of your vector of bool loop on my machine, showing the impact of optimizer flags on STL code
Have a look at this
Using arrays or std::vectors in C++, what’s the performance gap?
std::vector<bool>
is optimized for memory consumption rather than performance.You may trick it by using
std::vector<int>
. You should not have performance drawbacks then.Other answers are very good, but you could easily answer it yourself by this method.
ADDED: In response to comments, let me show you what I mean. I'm running VC On Windows, but this works on any language/OS. I took your first program and increased size to 20000 so it would run long enough. Then while it was running, I took several stackshots. They all look like this:
So what that says is that it is spending essentially all of it's time in the indexing operation on line 24, and the reason it's spending that time is that the
[]
operator is calling thebegin
operator. In more detail:is this code:
that calls:
which is this code (that I reformatted a little bit):
that calls:
which is doing this (in more detail):
What it is doing is writing 68 bytes of
0xCC
on the stack (for some debug reason) as part of getting thebegin
address of the vector, as part of computing the address ofv[i]
, prior to doing the assignment.The fraction of time it spends doing this is near 100%, because it was doing it on every one of the several samples that were taken. Could you have guessed that that's what it was spending nearly all of its time doing? I couldn't have.
This is, of course, a Debug build. If you switch to a Release build, but turn on debugging information, all these functions get inlined and optimized away, so it goes like 30 times faster, and again the stackshots tell exactly what it's doing.
So - people can tell you what it might be doing, but this shows how to find out for yourself what it is really doing.
On your environment it will undoubtedly be different.