Why is my quicksort so slow in python?

I tried to write quicksort in python (for learning algorithms) but I found it about 10 times slower than native sort. Here is the result:

16384 numbers:
native: 5.556 ms
quicksort: 96.412 ms

65536 numbers:
native: 27.190 ms
quicksort: 436.110 ms

262144 numbers:
native: 151.820 ms
quicksort: 1975.943 ms

1048576 numbers:
native: 792.091 ms
quicksort: 9097.085 ms

4194304 numbers:
native: 3979.032 ms
quicksort: 39106.887 ms

      

Does this mean there is something wrong with my implementation? Or is it okay because the native type uses a lot of low-level optimization?

However, I find it unacceptable for sorting 1 million numbers to take almost 10, although I wrote this just for teaching, not practical use. And my computer is pretty fast. Here's my code:

def quicksort(lst):
    quicksortinner(lst,0,len(lst)-1)

def quicksortinner(lst,start,end):
    if start>=end:
        return
    j=partition(lst,start,end)
    quicksortinner(lst,start,j-1)       
    quicksortinner(lst,j+1,end)

def partition(lst,start,end):
    pivotindex=random.randrange(start,end+1)
    swap(lst,pivotindex,end)    
    pivot=lst[end]
    i,j=start,end-1
    while True:
        while lst[i]<=pivot and i<=end-1:
            i+=1
        while lst[j]>=pivot and j>=start:
            j-=1
        if i>=j:
            break
        swap(lst,i,j)

    swap(lst,i,end)
    return i

def swap(lst,a,b):
    if a==b:
        return
    lst[a],lst[b]=lst[b],lst[a]

      

In the section, i scans right and j scans left (approach from algorithms). Earlier I tried a path where both move to the right (maybe more often) and there isn't much.

+3


source to share


2 answers


The native view is written in C. Your quicksort is written in pure Python. A speed difference of 10x is expected. If you run your code with PyPy , you should come close to native speed (PyPy uses JIT tracing to achieve high performance). Likewise, Cython will give a nice boost in speed (Cython is a Python-to-C compiler).

A way to tell if your algorithm is even in the same ball is to count the number of comparisons used by those sorting algorithms. In finely tuned code, comparison cost dominates over run time. Here's a tool for counting comparisons:

   class CountCmps(float):

       def __lt__(self, other):
           global cnt
           cnt += 1
           return float.__lt__(self, other)

>>> from random import random
>>> data = [CountCmps(random()) for i in range(10000)]
>>> cnt = 0
>>> data.sort()
>>> cnt
119883

      



Another factor is your random.randrange () call has a lot of clean Python steps and does more work than you might expect. This will be a non-trivial component of the overall runtime. Since choosing a random sample can be slow, consider using the median of the three methods to select a pivot point.

Also, calling the swap () function fails in CPython. Inserting this code will give you a speed boost.

As you can see, there is much more to Python optimization than just choosing a good algorithm. Hope this answer helps you further :-)

+6


source


You get a little speed up by switching to iteration instead of recursion , although most of this is probably due to native code being very fast.

I will illustrate this with a link to MergeSort . Apologies for not using QuickSort - they run at about the same speed, but MergeSort takes a little less time to wrap your head around, and the iterative version is easier to demonstrate.

Essentially, MergeSort sorts the string, splitting it in half, sorting the two separately (using it myself, of course!) And combining the results - sorted lists can be concatenated in O (n) so that results in a total of O (n log n).

Here is a simple recursive MergeSort algorithm:

def mergeSort(theList):
    if len(theList) == 1:
        return theList

    theLength = int(len(theList)/2)

    return mergeSorted( mergeSort(theList[0:theLength]), mergeSort(theList[theLength:]) )

def mergeSorted(theList1,theList2):
    sortedList = []
    counter1 = 0
    counter2 = 0

    while True:
        if counter1 == len(theList1):
            return sortedList + theList2[counter2:]

        if counter2 == len(theList2):
            return sortedList + theList1[counter1:]

        if theList1[counter1] < theList2[counter2]:
            sortedList.append(theList1[counter1])
            counter1 += 1
        else:
            sortedList.append(theList2[counter2])
            counter2 += 1

      

 



Just like you found this is beaten to the ground by the built-in sorting algorithm:

import timeit

setup = """from __main__ import mergeSortList
import random
theList = [random.random() for x in xrange(1000)]"""

timeit.timeit('theSortedList1 = sorted(theList)', setup=setup, number=1000)
#0.33633776246006164

timeit.timeit('theSortedList1 = mergeSort(theList)', setup=setup, number=1000)
#8.415547955717784

      

 



However, it is possible to increase the time slightly by eliminating the calls of recursive functions in the function mergeSort

(this also avoids the dangers of falling into the limits of recursion). This is done by starting with the basic elements and combining them in pairs, from the bottom up, instead of a top-down approach. For example:

def mergeSortIterative(theList):

    theNewList = map(lambda x: [x], theList)
    theLength = 1

    while theLength < len(theList):
        theNewNewList = []

        pairs = zip(theNewList[::2], theNewList[1::2])

        for pair in pairs:
            theNewNewList.append( mergeSorted( pair[0], pair[1] ) )

        if len(pairs) * 2 < len(theNewList):
            theNewNewList.append(theNewList[-1])

        theLength *= 2
        theNewList = theNewNewList

    return theNewNewList[0]

      

 



Now all the sorted list items are stored at each iteration, which significantly reduces memory requirements and eliminates recursive function calls. Running this gives about a 15% increase in speed in my work time - and it was the fast version

setup = """from __main__ import mergeSortIterative
import random
theList = [random.random() for x in xrange(1000)]"""

timeit.timeit('theSortedList1 = mergeSortIterative(theList)', setup=setup, number=1000)
#7.1798827493580575

      

 



So, I'm still not where next to the embedded version, but slightly better than before.

The recipe for an iterative QuickSort can be found here .

+1


source







All Articles