QuickSort stack overflow for sorted arrays (works for other datasets)

So I tried my best to optimize the Quicksort algorithm to work as efficiently as possible, even for sorted or near-sorted arrays, using the pivot, which is the median of three values, and also using insertion sort for small partition sizes. I've tested my code for large arrays of random values ​​and it works, but when I pass in an already sorted array, I get an error (ironically, because it led me to search this site). I believe this is a problem with my recursive calls (I know that partitioning works for other datasets, at least), but I can't figure out what to change.

This is part of my first class of semester data structures, so any code review would help as well. Thank.

public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
    if (firstIndex < (firstIndex + numberToSort - 1))
        if (numberToSort < 16) {
            insertionSort(data, firstIndex, numberToSort);
        } else {
            int pivot = partition(data, firstIndex, numberToSort);
            int leftSegmentSize = pivot - firstIndex;
            int rightSegmentSize = numberToSort - leftSegmentSize - 1;
            quickSort(data, firstIndex, leftSegmentSize);
            quickSort(data, pivot + 1, rightSegmentSize);
        }
}



public int partition(ArrayList<String> data, int firstIndex, int numberToPartition) {
    int tooBigNdx = firstIndex + 1;
    int tooSmallNdx = firstIndex + numberToPartition - 1;

    String string1 = data.get(firstIndex);
    String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
    String string3 = data.get(firstIndex + numberToPartition - 1);
    ArrayList<String> randomStrings = new ArrayList<String>();
    randomStrings.add(string1);
    randomStrings.add(string2);
    randomStrings.add(string3);
    Collections.sort(randomStrings);
    String pivot = randomStrings.get(1);
    if (pivot == string2) {
        Collections.swap(data, firstIndex, (firstIndex + (numberToPartition - 1)) / 2);
    }
    if (pivot == string3) {
        Collections.swap(data, firstIndex, firstIndex + numberToPartition - 1);
    }
    while (tooBigNdx < tooSmallNdx) {
        while ((tooBigNdx < tooSmallNdx) && (data.get(tooBigNdx).compareTo(pivot) <= 0)) {
            tooBigNdx++;
        }
        while ((tooSmallNdx > firstIndex) && (data.get(tooSmallNdx).compareTo(pivot) > 0)) {
            tooSmallNdx--;
        }
        if (tooBigNdx < tooSmallNdx) {// swap
            Collections.swap(data, tooSmallNdx, tooBigNdx);
        }
    }
    if (pivot.compareTo(data.get(tooSmallNdx)) >= 0) {
        Collections.swap(data, firstIndex, tooSmallNdx);
        return tooSmallNdx;
    } else {
        return firstIndex;
    }
}

      

+3


source to share


2 answers


In your method, partition

you sometimes use an out of range element:

String string1 = data.get(firstIndex);
String string2 = data.get((firstIndex + (numberToPartition - 1)) / 2);
String string3 = data.get(firstIndex + numberToPartition - 1);

      

(firstIndex + (numberToPartition - 1)) / 2

is not the index of the middle element. It will be(firstIndex + (firstIndex + (numberToPartition - 1))) / 2

= firstIndex + ((numberToPartition - 1) / 2)

...



In fact, if firstIndex > n/2

(where n

is the number of elements in the input file), you are using the element with an index less firstIndex

. For sorted arrays, this means that you select item in firstIndex

as the pivot item. So you get the recursion depth in

<code> Omega (n) </code>
      <br>
        <script async src=
" data-src="https://chart.googleapis.com/chart?cht=tx&chl=%5COmega(n)" class=" lazyloaded" src="https://fooobar.com/https://chart.googleapis.com/chart?cht=tx&chl=%5COmega(n)">,

which causes a stack overflow for large enough inputs.

+1


source


You can avoid it by not changing your algorithm too much. The trick is to optimize for the largest partition and only use recursion on the smallest. This usually means you need to change if

to while

. I can't test the java code right now, but it should look something like this:

public void quickSort(ArrayList<String> data, int firstIndex, int numberToSort) {
    while (firstIndex < (firstIndex + numberToSort - 1))
        if (numberToSort < 16) {
            insertionSort(data, firstIndex, numberToSort);
        } else {
            int pivot = partition(data, firstIndex, numberToSort);
            int leftSegmentSize = pivot - firstIndex;
            int rightSegmentSize = numberToSort - leftSegmentSize - 1;

            //only use recursion for the smallest partition
            if (leftSegmentSize < rightSegmentSize) {
                quickSort(data, firstIndex, leftSegmentSize);
                firstIndex = pivot + 1;
                numberToSort = rightSegmentSize;
            } else {
                quickSort(data, pivot + 1, rightSegmentSize);
                numberToSort = leftSegmentSize;
            }
        }
}

      



This ensures that the call stack is no larger than the size of the call stack O(log n)

, because with each call, you only recurse for an array of no larger than n/2

.

+2


source







All Articles