Quicksort vs heapsort

2019-01-07 04:38发布

Both quicksort and heapsort do in-place sorting. Which is better? What are the applications and cases in which either is preferred?

11条回答
做自己的国王
2楼-- · 2019-01-07 05:02

To me there is a very fundamental difference between heapsort and quicksort: the latter uses a recursion. In recursive algorithms the heap grows with the number of recursions. This does not matter if n is small, but right now I am sorting two matrices with n=10^9 !!. The program takes almost 10 GB of ram and any extra memory will make my computer to start swapping to virtual disk memory. My disk is a RAM disk, but still swapping to it make a huge difference in speed. So in a statpack coded in C++ that includes adjustable dimension matrices, with size unknown in advance to the programmer, and nonparametric statistical kind of sorting I prefer the heapsort to avoid delays to uses with very big data matrices.

查看更多
爱情/是我丢掉的垃圾
3楼-- · 2019-01-07 05:07

Heapsort is O(N log N) guaranted, what is much better than worst case in Quicksort. Heapsort doesn't need more memory for another array to putting ordered data as is needed by Mergesort. So why do comercial applications stick with Quicksort? What Quicksort has that is so special over others implementations?

I've tested the algorithms myself and I've seen that Quicksort has something special indeed. It runs fast, much faster than Heap and Merge algorithms.

The secret of Quicksort is: It almost doesn't do unnecessary element swaps. Swap is time consuming.

With Heapsort, even if all of your data is already ordered, you are going to swap 100% of elements to order the array.

With Mergesort, it's even worse. You are going to write 100% of elements in another array and write it back in the original one, even if data is already ordered.

With Quicksort you don't swap what is already ordered. If your data is completely ordered, you swap almost nothing! Although there is a lot of fussing about worst case, a little improvement on the choice of pivot, any other than getting the first or last element of array, can avoid it. If you get a pivot from the intermediate element between first, last and middle element, it is suficient to avoid worst case.

What is superior in Quicksort is not the worst case, but the best case! In best case you do the same number of comparisons, ok, but you swap almost nothing. In average case you swap part of the elements, but not all elements, as in Heapsort and Mergesort. That is what gives Quicksort the best time. Less swap, more speed.

The implementation below in C# on my computer, running on release mode, beats Array.Sort by 3 seconds with middle pivot and by 2 seconds with improved pivot (yes, there is an overhead to get a good pivot).

static void Main(string[] args)
{
    int[] arrToSort = new int[100000000];
    var r = new Random();
    for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);

    Console.WriteLine("Press q to quick sort, s to Array.Sort");
    while (true)
    {
        var k = Console.ReadKey(true);
        if (k.KeyChar == 'q')
        {
            // quick sort
            Console.WriteLine("Beg quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            QuickSort(arrToSort, 0, arrToSort.Length - 1);
            Console.WriteLine("End quick sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
        else if (k.KeyChar == 's')
        {
            Console.WriteLine("Beg Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            Array.Sort(arrToSort);
            Console.WriteLine("End Array.Sort at " + DateTime.Now.ToString("HH:mm:ss.ffffff"));
            for (int i = 0; i < arrToSort.Length; i++) arrToSort[i] = r.Next(1, arrToSort.Length);
        }
    }
}

static public void QuickSort(int[] arr, int left, int right)
{
    int begin = left
        , end = right
        , pivot
        // get middle element pivot
        //= arr[(left + right) / 2]
        ;

    //improved pivot
    int middle = (left + right) / 2;
    int
        LM = arr[left].CompareTo(arr[middle])
        , MR = arr[middle].CompareTo(arr[right])
        , LR = arr[left].CompareTo(arr[right])
        ;
    if (-1 * LM == LR)
        pivot = arr[left];
    else
        if (MR == -1 * LR)
            pivot = arr[right];
        else
            pivot = arr[middle];
    do
    {
        while (arr[left] < pivot) left++;
        while (arr[right] > pivot) right--;

        if(left <= right)
        {
            int temp = arr[right];
            arr[right] = arr[left];
            arr[left] = temp;

            left++;
            right--;
        }
    } while (left <= right);

    if (left < end) QuickSort(arr, left, end);
    if (begin < right) QuickSort(arr, begin, right);
}
查看更多
Juvenile、少年°
4楼-- · 2019-01-07 05:08

Heapsort has the benefit of having a worst running case of O(n*log(n)) so in cases where quicksort is likely to be performing poorly (mostly sorted data sets generally) heapsort is much preferred.

查看更多
Evening l夕情丶
5楼-- · 2019-01-07 05:12

Well if you go to architecture level...we use queue data structure in cache memory.so what ever is available in queue will get sorted.As in quick sort we have no issue dividing the array into any lenght...but in heap sort(by using array) it may so happen that the parent may not be present in the sub array available in cache and then it has to bring it in cache memory ...which is time consuming. That's quicksort is best!!

查看更多
家丑人穷心不美
6楼-- · 2019-01-07 05:14

To answer the original question and address some of the other comments here:

I just compared implementations of selection, quick, merge, and heap sort to see how they'd stack up against each other. The answer is that they all have their downsides.

TL;DR: Quick is the best general purpose sort (reasonably fast, stable, and mostly in-place) Personally I prefer heap sort though unless I need a stable sort.

Selection - N^2 - It's really only good for less than 20 elements or so, then it's outperformed. Unless your data is already sorted, or very, very nearly so. N^2 gets really slow really fast.

Quick, in my experience, is not actually that quick all the time. Bonuses for using quick sort as a general sort though are that it's reasonably fast and it's stable. It's also an in-place algorithm, but as it's generally implemented recursively, it will take up additional stack space. It also falls somewhere between O(n log n) and O(n^2). Timing on some sorts seem to confirm this, especially when the values fall within a tight range. It's way faster than selection sort on 10,000,000 items, but slower than merge or heap.

Merge sort is guaranteed O(n log n) since its sort is not data dependent. It just does what it does, regardless of what values you've given it. It's also stable, but very large sorts can blow out your stack if you're not careful about implementation. There are some complex in-place merge sort implementations, but generally you need another array in each level to merge your values into. If those arrays live on the stack you can run into issues.

Heap sort is max O(n log n), but in many cases is quicker, depending on how far you have to move your values up the log n deep heap. The heap can easily be implemented in-place in the original array, so it needs no additional memory, and it's iterative, so no worries about stack overflow while recursing. The huge downside to heap sort is that it is not a stable sort, which means it's right out if you need that.

查看更多
登录 后发表回答