Radix sort using bitwise operations

2019-03-17 02:28发布

问题:

First of all this is homework , and I found another topic talking about the same subject but there was no answer. Here is the problem:

Sorting by bit based on the assumption that the values ​​to be sorted are integers coded B bits (and therefore between 0 and 2B-1).

The main problem is how to make this kind of sort. Should I convert each integer to bits and compare them? Please do not give me the solution just a hint or an explanation of how to do it. Thanks for your help ! [EDIT] I found this script in the internet but i did not understand how it works :

#include <cstdlib>
#include <iostream>
#include <string>
#include <cctype>
#include<algorithm>
#include<string>
#include <iterator>
using namespace std;

// Radix sort comparator for 32-bit two's complement integers
class radix_test
{
    const int bit; // bit position [0..31] to examine
public:
    radix_test(int offset) : bit(offset) {} // constructor

    bool operator()(int value) const // function call operator
    {
        if (bit == 31) // sign bit
            return value < 0; // negative int to left partition
        else
            return !(value & (1 << bit)); // 0 bit to left partition
    }
};

// Least significant digit radix sort
void lsd_radix_sort(int *first, int *last)
{
    for (int lsb = 0; lsb < 32; ++lsb) // least-significant-bit
    {
        std::stable_partition(first, last, radix_test(lsb));
    }
}

// Most significant digit radix sort (recursive)
void msd_radix_sort(int *first, int *last, int msb = 31)
{
    if (first != last && msb >= 0)
    {
        int *mid = std::partition(first, last, radix_test(msb));
        msb--; // decrement most-significant-bit
        msd_radix_sort(first, mid, msb); // sort left partition
        msd_radix_sort(mid, last, msb); // sort right partition
    }
}

int main(int argc, char *argv[])
{

    int data[] = { 170, 45, 75, -90, -802, 24, 2, 66 };

    lsd_radix_sort(data, data + 8);
    // msd_radix_sort(data, data + 8);

    std::copy(data, data + 8, std::ostream_iterator<int>(std::cout, " "));

    system("PAUSE");
    return EXIT_SUCCESS;
}

回答1:

First of all, you don't need to convert an integer to bits, because it already is stored as bits. An int is usually 4 bytes, so 32 bits. You can access the bits using bit operators.

Radix sort is shown here in detail. https://en.wikipedia.org/wiki/Radix_sort

This example sorts based on base 10 digits.

To sort based on bit, you would change the algorithm slightly to use 2 instead of 10 in all places:

void radixsort(int *a, int n) {
...
  while (m / exp > 0) {
    int bucket[2] = { 0 };
    for (i = 0; i < n; i++)      bucket[a[i] / exp % 2]++;
    bucket[1] += bucket[0];
    for (i = n - 1; i >= 0; i--) b[--bucket[a[i] / exp % 2]] = a[i];
    for (i = 0; i < n; i++)      a[i] = b[i];
    exp *= 2;
...
  }
}

But if you needed to use bit wise operators instead, you could recognize that anything divided by 2 is simply >> 1, multiply by 2 is << 1, and modulo 2 is &1. By replacing exp with the bit position, we can rewrite as follows:

void radixsort(int *a, int n) {
  int i, b[MAX], m = a[0], bit = 0;
  for (i = 0; i < n; i++) if (a[i] > m) m = a[i];

  while ((m>>bit) > 0) {
    int bucket[2] = { 0 };
    for (i = 0; i < n; i++)      bucket[(a[i]>>bit) & 1]++;
    bucket[1] += bucket[0];
    for (i = n - 1; i >= 0; i--) b[--bucket[(a[i]>>bit) & 1]] = a[i];
    for (i = 0; i < n; i++)      a[i] = b[i];
    bit++;
...
  }
}

This sorts using a single bit. To use multiple bits, you'd need to make it more generic:

#define BITS 2
void radixsort(int *a, int n) {
  int i, b[MAX], m = a[0], pos = 0;
  int buckets=1<<BITS;
  int mask=buckets-1;
  for (i = 0; i < n; i++) if (a[i] > m) m = a[i];

  while ((m>>(pos*BITS)) > 0) {
    int bucket[1<<BITS] = { 0 };
    for (i = 0; i < n; i++)       bucket[(a[i]>>(pos*BITS)) & mask]++;
    for (i = 1; i < buckets; i++) bucket[i] += bucket[i - 1];
    for (i = n - 1; i >= 0; i--)  b[--bucket[(a[i]>>(pos*BITS)) & mask]] = a[i];
    for (i = 0; i < n; i++)       a[i] = b[i];
    pos++;
...
  }
}

This sorts using two bits, so 4 buckets are used for 00, 01, 10, and 11. 3 bits would use 8 buckets (000, 001, 010, 011, 100, 101, 110, 111).

You can see how increasing the BITS will make fewer passes, but the work done in each pass is larger.