Memory Allocation/Deallocation Bottleneck?

How much of a bottleneck is memory allocation/deallocation in typical real-world programs? Answers from any type of program where performance typically matters are welcome. Are decent implementations of malloc/free/garbage collection fast enough that it's only a bottleneck in a few corner cases, or would most performance-critical software benefit significantly from trying to keep the amount of memory allocations down or having a faster malloc/free/garbage collection implementation?

Note: I'm not talking about real-time stuff here. By performance-critical, I mean stuff where throughput matters, but latency doesn't necessarily.

Edit: Although I mention malloc, this question is not intended to be C/C++ specific.

标签： performance optimization memory-management garbage-collection malloc

12条回答

相关推荐>>

2楼-- · 2020-01-27 11:40

Others have covered C/C++ so I'll just add a little information on .NET.

In .NET heap allocation is generally really fast, as it it just a matter of just grabbing the memory in the generation zero part of the heap. Obviously this cannot go on forever, which is where garbage collection comes in. Garbage collection may affect the performance of your application significantly since user threads must be suspended during compaction of memory. The fewer full collects, the better.

There are various things you can do to affect the workload of the garbage collector in .NET. Generally if you have a lot of memory reference the garbage collector will have to do more work. E.g. by implementing a graph using an adjacency matrix instead of references between nodes the garbage collector will have to analyze fewer references.

Whether that is actually significant in your application or not depends on several factors and you should profile the application with actual data before turning to such optimizations.

0人赞添加讨论(0) 举报

Summer. ? 凉城

3楼-- · 2020-01-27 11:46

This is where c/c++'s memory allocation system works the best. The default allocation strategy is OK for most cases but it can be changed to suit whatever is needed. In GC systems there's not a lot you can do to change allocation strategies. Of course, there is a price to pay, and that's the need to track allocations and free them correctly. C++ takes this further and the allocation strategy can be specified per class using the new operator:

class AClass
{
public:
  void *operator new (size_t size); // this will be called whenever there's a new AClass
   void *operator new [] (size_t size); // this will be called whenever there's a new AClass []
  void operator delete (void *memory); // if you define new, you really need to define delete as well
  void operator delete [] (void *memory);define delete as well
};

Many of the STL templates allow you to define custom allocators as well.

As with all things to do with optimisation, you must first determine, through run time analysis, if memory allocation really is the bottleneck before writing your own allocators.

0人赞添加讨论(0) 举报

【Aperson】

4楼-- · 2020-01-27 11:47

A Java VM will claim and release memory from the operating system pretty much indepdently of what the application code is doing. This allows it to grab and release memory in large chunks, which is hugely more efficient than doing it in tiny individual operations, as you get with manual memory management.

This article was written in 2005, and JVM-style memory management was already streets ahead. The situation has only improved since then.

Which language boasts faster raw allocation performance, the Java language, or C/C++? The answer may surprise you -- allocation in modern JVMs is far faster than the best performing malloc implementations. The common code path for new Object() in HotSpot 1.4.2 and later is approximately 10 machine instructions (data provided by Sun; see Resources), whereas the best performing malloc implementations in C require on average between 60 and 100 instructions per call (Detlefs, et. al.; see Resources). And allocation performance is not a trivial component of overall performance -- benchmarks show that many real-world C and C++ programs, such as Perl and Ghostscript, spend 20 to 30 percent of their total execution time in malloc and free -- far more than the allocation and garbage collection overhead of a healthy Java application.

0人赞添加讨论(0) 举报

SAY GOODBYE

5楼-- · 2020-01-27 11:48

Allocating and releasing memory in terms of performance are relatively costly operations. The calls in modern operating systems have to go all the way down to the kernel so that the operating system is able to deal with virtual memory, paging/mapping, execution protection etc.

On the other side, almost all modern programming languages hide these operations behind "allocators" which work with pre-allocated buffers.

This concept is also used by most applications which have a focus on throughput.

0人赞添加讨论(0) 举报

你好瞎i

6楼-- · 2020-01-27 11:52

It's significant, especially as fragmentation grows and the allocator has to hunt harder across larger heaps for the contiguous regions you request. Most performance-sensitive applications typically write their own fixed-size block allocators (eg, they ask the OS for memory 16MB at a time and then parcel it out in fixed blocks of 4kb, 16kb, etc) to avoid this issue.

In games I've seen calls to malloc()/free() consume as much as 15% of the CPU (in poorly written products), or with carefully written and optimized block allocators, as little as 5%. Given that a game has to have a consistent throughput of sixty hertz, having it stall for 500ms while a garbage collector runs occasionally isn't practical.

0人赞添加讨论(0) 举报

Memory Allocation/Deallocation Bottleneck?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间