Removing elements from dynamic arrays

2019-05-06 19:17发布

问题:

So, I have this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void remove_element(int* array, int sizeOfArray, int indexToRemove)
{
    int* temp = malloc((sizeOfArray - 1) * sizeof(int*)); // allocate an array with a size 1 less than the current one
    memcpy(temp, array, indexToRemove - 1); // copy everything BEFORE the index
    memcpy(temp+(indexToRemove * sizeof(int*)), temp+((indexToRemove+1) * sizeof(int*)), sizeOfArray - indexToRemove); // copy everything AFTER the index
    free (array);
    array = temp;
}

int main()
{
    int howMany = 20;
    int* test = malloc(howMany * sizeof(int*));


    for (int i = 0; i < howMany; ++i)
        (test[i]) = i;



    printf("%d\n", test[16]);
    remove_element(test, howMany, 16);
    --howMany;
    printf("%d\n", test[16]);
    return 0;
}

It's reasonably self-explanatory, remove_element removes a given element of a dynamic array.

As you can see, each element of test is initialised to an incrementing integer (that is, test[n] == n). However, the program outputs

16
16

. Having removed an element of test, one would expect a call to to test[n] where n >= the removed element would result in what test[n+1] would have been before the removal. So I would expect the output

16
17

. What's going wrong?

EDIT: The problem has now been solved. Here's the fixed code (with crude debug printfs), should anyone else find it useful:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int remove_element(int** array, int sizeOfArray, int indexToRemove)
{
    printf("Beginning processing. Array is currently: ");
    for (int i = 0; i < sizeOfArray; ++i)
        printf("%d ", (*array)[i]);
    printf("\n");

    int* temp = malloc((sizeOfArray - 1) * sizeof(int)); // allocate an array with a size 1 less than the current one

    memmove(
            temp,
            *array,
            (indexToRemove+1)*sizeof(int)); // copy everything BEFORE the index

    memmove(
            temp+indexToRemove,
            (*array)+(indexToRemove+1),
            (sizeOfArray - indexToRemove)*sizeof(int)); // copy everything AFTER the index


    printf("Processing done. Array is currently: ");
    for (int i = 0; i < sizeOfArray - 1; ++i)
        printf("%d ", (temp)[i]);
    printf("\n");

    free (*array);
    *array = temp;
    return 0;

}

int main()
{
    int howMany = 20;
    int* test = malloc(howMany * sizeof(int*));


    for (int i = 0; i < howMany; ++i)
        (test[i]) = i;



    printf("%d\n", test[16]);
    remove_element(&test, howMany, 14);
    --howMany;
    printf("%d\n", test[16]);
    return 0;
}

回答1:

I see several issues in the posted code, each of which could cause problems:

returning the new array

Your function is taking an int* array but then you are trying to swap it with your temp variable at the end prior to returning the new array. This will not work, as you are simply replacing the local copy of int* array which will disappear after you return from the function.

You either need to pass your array pointer in as an int**, which would allow you to set the actual pointer to the array in the function, or, I would suggest just returning a value of int* for your function, and returning the new array.

Also, as mentioned in this answer, you really don't even need to reallocate when deleting an element from the array, since the original array is big enough to hold everything.

size and offset calculations

  1. You are using sizeof(int*) for calculating the array element size. This may work for some types, but, for instance, for a short array sizeof(short*) does not work. You don't want the size of the pointer to the array, you want the size of the elements, which for your example should be sizeof(int) although it may not cause problems in this case.

  2. Your length calculation for the offsets into the arrays looks ok, but you're forgetting to multiply the number of elements by the element size for the size parameter of the memcpy. e.g. memcpy(temp, array, indexToRemove * sizeof(int));.

  3. Your second call to memcpy is using temp plus the offset as the source array, but it should be array plus the offset.

  4. Your second call to memcpy is using sizeOfArray - indexToRemove for the number of elements to copy, but you should only copy SizeOfArray - indexToRemove - 1 elements (or (sizeOfArray - indexToRemove - 1) * sizeof(int) bytes

  5. Wherever you are calculating offsets into the temp and array arrays, you don't need to multiply by sizeof(int), since pointer arithmetic already takes into account the size of the elements. (I missed this at first, thanks to: this answer.)

looking at incorrect element

You are printing test[16] (the 17th element) for testing, but you are removing the 16th element, which would be test[15].

corner cases

Also (thanks to this answer) you should handle the cases where indexToRemove == 0 and indexToRemove == (sizeOfArray - 1), where you can do the entire removal in one memcpy.

Also, you need to worry about the case where sizeOfArray == 1. In that case perhaps either allocate a 0 size block of memory, or return null. In my updated code, I chose to allocate a 0-size block, just to differentiate between an array with 0 elements vs. an unallocated array.

Returning a 0-size array also means there are no additional changes necessary to the code, because the conditions before each memcpy to handle the first two cases mentioned will prevent either memcpy from taking place.

And just to mention, there's no error handling in the code, so there are implicit preconditions that indexToRemove is in bounds, that array is not null, and that array has the size passed as sizeOfArray.

example updated code

int* remove_element(int* array, int sizeOfArray, int indexToRemove)
{
    int* temp = malloc((sizeOfArray - 1) * sizeof(int)); // allocate an array with a size 1 less than the current one

    if (indexToRemove != 0)
        memcpy(temp, array, indexToRemove * sizeof(int)); // copy everything BEFORE the index

    if (indexToRemove != (sizeOfArray - 1))
        memcpy(temp+indexToRemove, array+indexToRemove+1, (sizeOfArray - indexToRemove - 1) * sizeof(int)); // copy everything AFTER the index

    free (array);
    return temp;
}

int main()
{
    int howMany = 20;
    int* test = malloc(howMany * sizeof(int*));

    for (int i = 0; i < howMany; ++i)
        (test[i]) = i;

    printf("%d\n", test[16]);
    remove_element(test, howMany, 16);
    --howMany;
    printf("%d\n", test[16]);
    return 0;
}

a few words on memory management/abstract data types

Finally, something to consider: there are possible issues both with using malloc to return memory to a user that is expected to be freed by the user, and with freeing memory that a user malloced. In general, it's less likely that memory management will be confusing and hard to handle if you design your code units such that memory allocation is handled within a single logical code unit.

For instance, you might create an abstract data type module that allowed you to create an integer array using a struct that holds a pointer and a length, and then all manipulation of that data goes through functions taking the structure as a first parameter. This also allows you, except within that module, to avoid having to do calculations like elemNumber * sizeof(elemType). Something like this:

struct MyIntArray
{
   int* ArrHead;
   int ElementSize;

   // if you wanted support for resizing without reallocating you might also
   //   have your Create function take an initialBufferSize, and:
   // int BufferSize;
};

void MyIntArray_Create(struct MyIntArray* This, int numElems /*, int initBuffSize */);
void MyIntArray_Destroy(struct MyIntArray* This);
bool MyIntArray_RemoveElement(struct MyIntArray* This, int index);
bool MyIntArray_InsertElement(string MyIntArray* THis, int index, int Value);

etc.

This is a basically implementing some C++-like functionality in C, and it's IMO a very good idea, especially if you are starting from scratch and you want to create anything more than a very simple application. I know of some C developers that really don't like this idiom, but it has worked well for me.

The nice thing about this way of implementing things is that anything in your code that was using the function to remove an element would not ever be touching the pointer directly. This would allow several different parts of your code to store a pointer to your abstract array structure, and when the pointer to the actual data of the array was reallocated after the element was removed, all variables pointing to your abstract array would be automatically updated.

In general, memory management can be very confusing, and this is one strategy that can make it less so. Just a thought.



回答2:

You don't actually change the passed pointer. You're only changing your copy of array.

void remove_element(int* array, int sizeOfArray, int indexToRemove)
{
    int* temp = malloc((sizeOfArray - 1) * sizeof(int*));

    free (array); /* Destroys the array the caller gave you. */
    array = temp; /* Temp is lost. This has **no effect** for the caller. */
}

So after the function the array still points to where it used to point BUT, you've also freed it, which adds insult to injury.

Try something like this:

void remove_element(int **array, int sizeOfArray, int indexToRemove)
                        ^^
{
    int *temp = malloc((sizeOfArray - 1) * sizeof(int*));
    /* More stuff. */

    free(*array);
    *array = temp;
}

There is also a C FAQ: Change passed pointer.



回答3:

@cnicutar is right (+1), but also, you write:

memcpy(temp+(indexToRemove * sizeof(int*)), temp+((indexToRemove+1) * sizeof(int*)), sizeOfArray - indexToRemove); // copy everything AFTER the index

while it should be:

memmove(temp+(indexToRemove), temp+(indexToRemove+1), sizeOfArray - indexToRemove); // copy everything AFTER the index

Since the multiplication by the size of int* is done by the compiler (that's pointer arithmetic)

Also, when moving overlaying memory areas, use memmove and not memcpy.



回答4:

Further: the second argument to your second memcpy call should be based on array, not on temp, right? And shouldn't you be mallocing and copying based on sizeof int and not based on sizeof int*, since your arrays store integers and not pointers? And don't you need to multiply the number of bytes you're copying (the last argument to memcpy) by sizeof int as well?

Also, watch the case where indexToRemove == 0.



回答5:

There are a few problems with that code :

(a) When allocating memory, you need to make sure to use the correct type with sizeof. For an array of int eg., you allocate a memory block with a size that is a multiple of sizeof(int). So :

int* test = malloc(howMany * sizeof(int*));

should be :

int* test = malloc(howMany * sizeof(int));

(b) You don't free the memory for the array at the end of main.

(c) memcpy takes the amount of bytes to copy as the third parameter. So, you need to again make sure to pass a multiple of sizeof(int). So :

memcpy(temp, array, cnt);

should be :

memcpy(temp, array, cnt * sizeof(int));

(d) when copying items from the old array to the new array, make sure to copy the correct data. For example, there are indexToRemove items before the item at index indexToRemove, not one less. Similarly, you'll need to make sure that you copy the correct amount of items after the item that needs to be removed.

(e) When incrementing a pointer, you don't need to multiply with sizeof(int) - that's done implicitly for you. So :

temp + (cnt * sizeof(int))

should really be :

temp + cnt

(f) In your remove_element function, you assign a value to the local variable array. Any changes to local variables are not visible outside of the function. So, after the call to remove_element ends, you won't see the change in main. One way to solve this, is to return the new pointer from the function, and assign it in main :

test = remove_element(test, howMany, 16);


回答6:

All the other answers make good points about the various problems/bugs in the code.

But, why reallocate at all (not that the bugs are all related to reallocation)? The 'smaller' array will fit fine in the existing block of memory:

// Note: untested (not even compiled) code; it also doesn't do any
//   checks for overflow, parameter validation, etc.
int remove_element(int* array, int sizeOfArray, int indexToRemove)
{
    // assuming that sizeOfArray is the count of valid elements in the array
    int elements_to_move = sizeOfArray - indexToRemove - 1;

    memmove( &array[indexToRemove], &array[indexToRemove+1], elements_to_move * sizeof(array[0]));

    // let the caller know how many elements remain in the array
    //  of course, they could figure this out themselves...
    return sizeOfArray - 1;
}