One large malloc versus multiple smaller reallocs

2019-08-11 08:49发布

问题:

Sorry if this has been asked before, I haven't been able to find just what I am looking for.

I am reading fields from a list and writing them to a block of memory. I could

  • Walk the whole list, find the total needed size, do one malloc and THEN walk the list again and copy each field;
  • Walk the whole list and realloc the block of memory as I write the values;

Right now the first seems the most efficient to me (smallest number of calls). What are the pros and cons of either approach ?

Thank you for your time.

回答1:

You're probably better off allocating a decent amount of space initially, based on what you think is the most likely maximum.

Then, if you find you need more space, don't just allocate enough for the extra, allocate a big chunk extra.

This will minimise the number of re-allocations while still only processing the list once.

By way of example, initially allocate 100K. If you then find you need more, re-allocate to 200K, even if you only need 101K.



回答2:

The first approach is almost always better. A realloc() typically works by copying the entire contents of a memory block into the freshly allocated, larger block. So n reallocs can mean n copies, each one larger than the last. (If you are adding m bytes to your allocation each time, then the first realloc has to copy m bytes, the next one 2m, the next 3m, ...).

The pedantic answer would be that the internal performance implications of realloc() are implementation specific, not explicitly defined by the standard, in some implementation it could work by magic fairies that move the bytes around instantly, etc etc etc - but in any realistic implementation, realloc() means a copy.



回答3:

Don't reinvent the wheel and use CCAN's darray which implements an approach similar to what paxdiablo described. See darray on GitHub