In an ACM example, I had to build a big table for dynamic programming. I had to store two integers in each cell, so I decided to go for a std::pair<int, int>
. However, allocating a huge array of them took 1.5 seconds:
std::pair<int, int> table[1001][1001];
Afterwards, I have changed this code to
struct Cell {
int first;
int second;
}
Cell table[1001][1001];
and the allocation took 0 seconds.
What explains this huge difference in time?
My guess it's the way std::pair is created. There is more overhead during invoking a pair constructor 1001x1001 times than when you just allocate a memory range.
These are all very good guesses, but as everyone knows, guesses are not reliable.
I would say just randomly pause it within that 1.5 seconds, but you'd have to be pretty quick. If you increased each dimension by a factor of about 3, you could make it take more like 10+ seconds, so it would be easier to pause.
Or, you could get it under a debugger, break it in the pair constructor code, and then single step to see what it is doing.
Either way, you would get a firm answer to the question, not just a guess.
The answers so far don't explain the full magnitude of the problem.
As sharptooth has pointed out, the pair solution initializes the values to zero. As Lemurik pointed out, the pair solution isn't just initializing a contiguous block of memory, instead it is calling the pair constructor for every element in the table. However, even that doesn't account for it taking 1.5 seconds. Something else is happening.
Here's my logic:
Assuming you were on an ancient machine, say running at 1.33ghz, then 1.5 seconds is 2e9 clock cycles. You've got 2e6 pairs to construct, so somehow each pair constructor is taking 1000 cycles. It doesn't take 1000 cycles to call a constructor that just sets two integers to zero. I can't see how cache misses would make it take that long. I would believe it if the number was less than 100 cycles.
I thought it would be interesting to see where else all these CPU cycles are going. I used the crappiest oldest C++ compiler I could find to see if I could attain the level of wastage required. That compiler was VC++ v6. In debug mode, it does something I don't understand. It has a big loop that calls the pair constructor for each item in the table - fair enough. That constructor sets the two values to zero - fair enough. But just before doing that, it sets all the bytes in a 68 byte region to 0xcc. That region is just before the start of the the big table. It then overwrites the last element of that region with 0x28F61200. Every call of the pair constructor repeats this. Presumably this is some kind of book keeping by the compiler so it knows which regions are initialized when checking for pointer errors at run time. I'd love to know exactly what this is for.
Anyway, that would explain where the extra time is going. Obviously another compiler may not be this bad. And certainly an optimized release build wouldn't be.
std::pair<int, int>::pair()
constructor initializes the fields with default values (zero in case ofint
) and yourstruct Cell
doesn't (since you only have an auto-generated default constructor that does nothing).Initializing requires writing to each field which requires a whole lot of memory accesses that are relatively time consuming. With
struct Cell
nothing is done instead and doing nothing is a bit faster.it is really a good example about one should write C++ and use STL carefully. any thoughts?
my project is working on a C&C++ code level benchmark test tool in which we'll make lots of code samples to found out what is "good" code and what is a "bad" coding habit. see http://effodevel.googlecode.com to learn more about C9B.M. planning. anyone if you had experienced lots of such cases, please kindly join the project to help us.