Sorting algorithm using bitmask and binary operators

I am trying to understand how the following algorithm (in Java) works.

private static void sort(int[] values, k) 
{
    int mask = 0x00000001 << k;
    int insertIndex = 0;
    int ArrayList<Integer> big = new ArrayList<Integer>();

    for (int i = 0; i < values.length; ++i) {
        if ((values[i] & mask) == 0) 
            values[insertIndex++] = values[i];
        else
            big.add(values[i]);
    }

    for (int i = 0; i < big.size(); ++i)
        values[insertIndex++] = big.get(i);
}

public static void sort(int[] values) 
{
    for (int i = 0; i < 31; ++i)
        sort(values, i);
}

      

Here's what I haven't figured out yet:

First 0x00000001

(32 bits?) Is shifted left by k

. So now the mask is a different number. Then we check to see whether the current value added values[i]

by using mask

using Binary-And operator equal 0

.

I cannot understand the role of the proposal (values[i] & mask) == 0

. The second for-loop

mess with my head. And why do we public static void sort(int[] values)

only repeat 31 times?

This algorithm does not correctly sort negative integers. How so? How can it be changed so that negative integers are sorted as well?

This algorithm is said to use similar concepts to one of the well-known sorting algorithms. As heap sort , insertion sort , Quicksort , Merge-sorting , bucket-sorting or Radix-Sort . I ruled out the possibility of quick sorting , because it uses sections and recursion. Merge sort uses recursion and merges subarrays, so I eliminated that too. Insert-Sort is unlikely to be one of them due to the dramatic difference in time and complexity. Sorting attachments O(n^2)

, and this algorithm O(n)

.Bucket-sort doesn't actually sort anything, it just divides the array into sub-arrays, which can then be sorted using some sort of sorting algorithm. Heap Sort is a comparison-based algorithm, and these algorithms are not similar. So the only possible possibility is Radix-Sort , which is not a comparison based algorithm. Therefore, best of all, this algorithm is similar to Radix sort. Am I right or am I hopelessly lost?

+3


source to share


5 answers


Yes, it is an implementation of radix sort using bitwise operators. The implementation is not very intuitive due to the bitwise operators.

The algorithm works by sorting through the list one digit at a time. This is why there are 31 loops ... because there are only 32 bits in Java integers, and the leftmost bit indicates that the value is negative, so there are only 31 bits in a positive Java integer.

The mask is used to keep track of which location is being checked. 0x0001

- this place, 0x0002

- this place, 0x0004

- place of triples, and 0x0001 << n

- n-th place.

Sorting is done by putting all integers where the check bit is 0 on the left side of the array and all integers where the check bit is 1 on the right. It's pretty easy: mask & values[i] == 0

if the masked bit of the value is 0.

Things get tricky when we start moving variables around. Whenever we find a value with 0 in our check bit, we want to move it to the left. However, this requires overwriting the value. insertIndex

and i

start at 0. Every time we find the value at position "i" we need to move to the left, we move it to insertIndex and if we find the value we need to navigate to the correct one, we store it in a temporary arraylist for later ...



0 0 1 0 1 0   i and insertindex are the same, so we copy the 0 to itself
^

0 0 1 0 1 0   Again.
  ^

0 0 1 0 1 0   Here we find our first 1. We don't increment insertindex
    ^         We store the 1 in the temp array list. Nothing is copied.

0 0 0 0 1 0   This is a zero. We copy the zero from i to insertindex
    ^<^

0 0 0 0 1 0   Another 1 goes in the temp array list. Nothing is copied, we don't 
              increment insertindex.
      ^ ^

0 0 0 0 1 0   We copy the last zero from i to insertindex. Remember that the
      ^<<<^   zero we overwrite is actually safely in the third bit - it
              was copied earlier.

      

Now for the second loop of the loop, we take all the values ​​we stored in our list of temp arrays and put them on the empty right side of the array. Everything on this side is somewhere else on the "left side" of the array.

0 0 0 0 1 0   Retrieve the first value from the temp array list
        ^

0 0 0 0 1 1   Retrieve the second value from the temp array list
          ^

      

Hope it helps! If you want to increase this to negative values ​​try looking at Two's Complement Notation

+2


source


This is an implementation of a radix sort with a radius of 2. This variety looks least significant at first. The implementation only works with positive numbers.



Your function goes through the array 31 times. Each path is sorted based on the value of the k

-th bit. Note that when the algorithm k

reorders the numbers, it preserves the relative order of the elements with identical -th bits. This is critical to implementation as it allows the algorithm to preserve the relative ordering achieved early on.

+3


source


First, the way the question is phrased sounds like it’s some kind of homework, since you have the code and some supposedly true statements, but don’t have a name or source. If so, please stay ahead so we can help you in the right direction.

The operator checks int mask = 0x00000001 << k;

together with (values[i] & mask) == 0

whether the bit is set. Remember that "a and b = 1" if and only if a = 1 and b = 1, so all bits not in position k are necessarily 0, and a bit in position k is equal to that bit in values[i]

.

The for-loop is 31 times, not 30 (1..30 is 30 numbers + 1 for zero). The reason is that a Java integer is exactly 32 bits, with the highest value only used by negative numbers. This is also the reason why it does not work correctly with negative numbers.

+2


source


Yes, this is an implementation of Radix sort .

+1


source


It looks like the first call sort(values, i)

puts all integers whose least significant bit is 0 at the beginning of the array, and all integers whose least significant bit is 1 at the end of the array.
The second call puts all integers whose least significant bit is 0 at the beginning of the array, and all integers whose least significant bit is 1 at the end of the array (without changing the internal order of each group).
...
The last call puts all integers whose 31st bit is 0 at the beginning of the array, and all integers whose 31st bit is 1 at the end of the array (without changing the internal order of each group)

Thus, sort(int[] values)

sorts the array in ascending order, assuming it only contains positive integers (since it ignores the sign bit).

+1


source







All Articles