Performance impact of realloc ()

2019-04-03 15:28发布

问题:

I have a list of records, at the starting I dont know the number of records. I need to read them into array. so is it advisable to read all record one by one & doing realloc one by one & go on increasing the array size as the element comes OR should I spend one pass in identifying the number of records & do malloc only once ? Which one will be be less computationally expensive ?

回答1:

A realloc isn't really very expensive. But calling realloc for each element is a bit much. I suggest you do this:

  • Start with a size
  • When you add an element, check if you have enough space
  • When you don't have enough space, double the currrent amount

Correctly guessing an adequate initial size also helps. So if 60% of your inputs are less than 100 records, start with that.



回答2:

It is a common technique to double the size of the array whenever it becomes full, as others have noted. In fact, using this technique ensures that no more than a constant amount of time is spent per element, as explained on WikiPedia.

Depending on how fast the code needs to be and what source you are reading from, it might be a good idea to compute the size of the output in a separate pass. If you are reading from disk you should probably use a dynamic array but otherwise you should probably do whatever is easier to implement.



回答3:

You should not realloc( ) one-by-one.

The optimal strategy for this sort of thing depends on the specifics of what you're doing, but one common, simple, and nearly optimal approach is to increase your allocation size by a factor of two whenever you run out of room in your array.



回答4:

You could also consider using a linked list instead of an array for your task if it is possible. It will require only malloc to add new elements.