C++ allocates abnormally large amout memory for va

2019-01-18 23:02发布

I recently got to know an integer takes 4 bytes from the memory.

First ran this code, and measured the memory usage:

int main()
{
   int *pointer;
}

enter image description here

  • It took 144KB.

Then I modified the code to allocate 1000 integer variables.

int main()
{
   int *pointer;

   for (int n=0; n < 1000; n++)
     { 
       pointer = new int ; 
     }
}

enter image description here

  • Then it took (168-144=) 24KB
    but 1000 integers are suppose to occupy (4bytes x 1000=) 3.9KB.

Then I decided to make 262,144 integer variables which should consume 1MB of memory.

int main()
{
   int *pointer;

   for (int n=0; n < 262144; n++)
     { 
       pointer = new int ; 
     }
}

Surprisingly, now it takes 8MB

enter image description here

Memory usage, exponentially grows respective to the number of integers.
Why is this happening?

I'm on Kubuntu 13.04 (amd64)
Please give me a little explanation. Thanks!

NOTE: sizeof(integer) returns 4

9条回答
混吃等死
2楼-- · 2019-01-18 23:33

You are allocating multiple dynamic variables. Each variable contain 4 bytes of data, but a memory manager usually stores some additional information about allocated blocks, and each block should be aligned, that creates additional overhead.

Try pointer = new int[262144]; and see the difference.

查看更多
Ridiculous、
3楼-- · 2019-01-18 23:33

You are allocating more than just an int, you are also allocating a heap block which has overhead (which varies by platform). Something needs to keep track of the heap information. If you instead allocated an array of ints, you'd see memory usage more in line with your expectations.

查看更多
Anthone
4楼-- · 2019-01-18 23:37

In addition to the alignment and overhead issues mentioned in the other questions, this may be due to the way the C++ runtime requests process memory allocations from the OS.

When the process's data section fills up, the runtime has to get more memory allocated to the process. It might not do this in equal-sized chunks. A possible strategy is that each time it requests memory, it increases the amount that it request (maybe doubling the heap size each time). This strategy keeps the memory allocation small for programs that don't use much memory, but reduces the number of times that a large application has to request new allocations.

Try running your program under strace and look for calls to brk, and note how large the request is each time.

查看更多
姐就是有狂的资本
5楼-- · 2019-01-18 23:47

Each declaration makes a new variable suitable for the alignment options of the compiler which needs spaces between(starting address of a variable should be a multiple of 128 or 64 or 32 (bits) and this causes the inefficiency for memory of many variables vs one array). Array is much more useful to have the contiguous area.

查看更多
时光不老,我们不散
6楼-- · 2019-01-18 23:51

Non-optimized allocations in common allocators go with some overhead. You can think of two "blocks": An INFO and a STORAGE block. The Info block will most likely be right in front of your STORAGE block.

So if you allocate you'll have something like that in your memory:

        Memory that is actually accessible
        vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
--------------------------------------------
|  INFO |  STORAGE                         |
--------------------------------------------
^^^^^^^^^
Some informations on the size of the "STORAGE" chunk etc.

Additionally the block will be aligned along a certain granularity (somewhat like 16 bytes in case of int).

I'll write about how this looks like on MSVC12, since I can test on that in the moment.

Let's have a look at our memory. The arrows indicate 16 byte boundaries.

Memory layout in case of int allocation; each block represents 4 bytes

If you allocate a single 4 byte integer, you'll get 4 bytes of memory at a certain 16 bytes boundary (the orange square after the second boundary). The 16 bytes prepending this block (the blue ones) are occupied to store additional information. (I'll skip things like endianess etc. here but keep in mind that this can affect this sort of layout.) If you read the first four bytes of this 16byte block in front of your allocated memory you'll find the number of allocated bytes.

If you now allocate a second 4 byte integer (green box), it's position will be at least 2x the 16 byte boundary away since the INFO block (yellow/red) must fit in front of it which is not the case at the rightnext boundary. The red block is again the one that contains the number of bytes.

As you can easily see: If the green block would have been 16 bytes earlier, the red and the orange block would overlap - impossible.

You can check that for yourself. I am using MSVC 2012 and this worked for me:

char * mem = new char[4096];
cout << "Number of allocated bytes for mem is: " << *(unsigned int*)(mem-16) << endl;
delete [] mem;


double * dmem = new double[4096];
cout << "Number of allocated bytes for dmem is: " << *(unsigned int*)(((char*)dmem)-16) << endl;
delete [] dmem;

prints

Number of allocated bytes for mem is: 4096
Number of allocated bytes for dmem is: 32768

And that is perfectly correct. Therefore a memory allocation using new has in case of MSVC12 an additional "INFO" block which is at least 16 bytes in size.

查看更多
一夜七次
7楼-- · 2019-01-18 23:52

Each time you allocate an int:

  • The memory allocator must give you a 16-byte aligned block of space, because, in general, memory allocation must provide suitable alignment so that the memory can be used for any purpose. Because of this, each allocation typically returns at least 16 bytes, even if less was requested. (The alignment requires may vary from system to to system. And it is conceivable that small allocations could be optimized to use less space. However, experienced programmers know to avoid making many small allocations.)
  • The memory allocator must use some memory to remember how much space was allocated, so that it knows how much there is when free is called. (This might be optimized by combining knowledge of the space used with the delete operator. However, the general memory allocation routines are typically separate from the compiler’s new and delete code.)
  • The memory allocator must use some memory for data structures to organize information about blocks of memory that are allocated and freed. Perhaps these data structures require O(n•log n) space, so that overhead grows when there are many small allocations.

Another possible effect is that, as memory use grows, the allocator requests and initializes larger chunks from the system. Perhaps the first time you use the initial pool of memory, the allocator requests 16 more pages. The next time, it requests 32. The next time, 64. We do not know how much of the memory that the memory allocator has requested from the system has actually been used to satisfy your requests for int objects.

Do not dynamically allocate many small objects. Use an array instead.

查看更多
登录 后发表回答