C++ QuickSort Pivot Optimization

2019-09-06 10:13发布

问题:

Currently, I have a barebones implementation of the quicksort algorithm to sort some randomly generated numbers. The sort is efficient, more so than merge sort. However, for specific number sets (e.g. reversed numbers that need to be sorted the other way around), I need to optimized the pivot.

int* partition (int* first, int* last);
void quickSort(int* first, int* last) {
    if (last - first <= 1) return;

    int* pivot = partition(first, last);
    quickSort(first, pivot);
    quickSort(pivot + 1, last);
}

int* partition (int* first, int* last) {
    int pivot = *(last - 1);
    int* i = first;
    int* j = last - 1;

    for (;;) {
        while (*i < pivot && i < last) i++;
        while (*j >= pivot && j > first) j--;
        if (i >= j) break;
        swap (*i, *j);
    }

    swap (*(last - 1), *i);

    return i;
}

So for this code, I either want to use a random number as the pivot for the partitioning step, or use the median of the first, middle, and last elements as the pivot.

How would I do this?

I'm new to sorting algorithms, and my understanding of them is not complete just yet.

回答1:

Just change these lines:

    int pivot = *(last - 1);
    …
    swap ((last - 1), i);

to something like:

    int* pos = (first + rand(last - first));
    int pivot = *pos;
    …
    swap (pos, i);

or

    int* pos = mean_of_three((last - 1), (first), ((first + last) / 2));
    int pivot = *pos;
    …
    swap (pos, i);

where mean_of_three take 3 pointers and returns a pointer to the mean.



回答2:

As you have already mentioned, selecting first or last element of the array as pivot is not the best practice and cause the algorithm to fall into O(n^2). the best choice of pivot selection algorithm depends on the data that your program is likely to encounter. If there is a chance that the data will be sorted or nearly sorted, then random pivot is a very good choice(and very easy to implement) to avoid the O(n^2) behavior. On the other hand, the expense of selecting the middle element rather than the first is minimal and provides quite effective protection against sorted data.

Then again, if you are assured that your data will not be sorted or nearly sorted, then the median-of-three partitioning strategy seems to be the best.

int PickPivotUsingMedian(int a[], int first, int last)
{
int mid = (first+ right)/2;
if (a[mid ] < a[first]) 
    swap(&a[first],&a[mid ]);
if (a[last] < a[first])
    swap(&a[first],&a[last]);
if (a[last]< a[mid ])
    swap(&a[mid ],&a[last]);

swap(&a[mid], &a[last - 1]);//since the largest is already in the right.
return a[last - 1];
}


回答3:

Choose a random index within the bounds of your array and use that element as the pivot.