In quicksort, if the array is randomized, does it still make sense to use the median of three to select the pivot value?

I have compared the execution times of various pivot-selection strategies. Surprisingly, the simplest one, where the first item is always selected, is the fastest. Perhaps this is because I am filling the array with random data.

If the array is already randomized (shuffled), does pivot choice matter? For example, is choosing the median of three as the pivot always (?) better than selecting the first element as the pivot? That is not what I observed. Is this because, if the array is already randomized, there is no reason to assume any ordering, and the median-of-three approach only pays off when some degree of sorting exists?

+3




2 answers


The worst-case execution time for quicksort is O(n²). Quicksort is a fast sorting algorithm only in the average case.

To achieve the average running time of O(n log n), you need to select a random pivot element. But instead of picking a random pivot element, you can shuffle the list and then always select the first element.

To see why this works, think of it this way: shuffling applies a random permutation to the list, so a uniformly random element ends up in the first position (and likewise in every other position). Equivalently, shuffling the list means randomly selecting one of all the elements for the first position, then randomly selecting one of the remaining (not yet placed) elements for the second position, and so on.

If your list is already randomly generated, you can select the first item as the pivot directly, without shuffling again.
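To make this concrete, here is a minimal sketch in Python (my illustration, with hypothetical function names, not code from the answer): a quicksort that always takes the first element as the pivot, plus a wrapper that shuffles once up front so that every first-element pivot is effectively a uniformly random pivot.

    import random

    def partition(a, lo, hi):
        # Lomuto-style partition that uses the first element, a[lo], as the pivot
        pivot = a[lo]
        i = lo
        for j in range(lo + 1, hi + 1):
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]  # put the pivot between the two parts
        return i

    def quicksort(a, lo=0, hi=None):
        # plain quicksort; fast only if the first elements look random
        if hi is None:
            hi = len(a) - 1
        if lo < hi:
            p = partition(a, lo, hi)
            quicksort(a, lo, p - 1)
            quicksort(a, p + 1, hi)

    def randomized_quicksort(a):
        # one O(n) shuffle makes every first-element pivot a uniformly random pivot
        random.shuffle(a)
        quicksort(a)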



Thus, selecting the first item is fast because of the randomly generated input, but selecting the third or the last item would be just as fast as selecting the first.

All other ways of selecting a pivot element must compute something (a median, a random number, or the like), but they have no advantage over a random selection.

+1




Essentially the same as the previous answer, but I believe I can add some more information.

"Surprisingly, the simplest one, where the first item is always selected, is the fastest."

This is not really surprising, since you mentioned that you are testing the algorithm with random data. In reality, the percentage of nearly-sorted and sorted data is much higher than one might expect. Take historical data, for example: when you collect it into a log file, some items may be out of order, but most of them are already sorted. Unfortunately, a quicksort implementation that takes the first (or last) element as the pivot is vulnerable to such input, and it degenerates to O(n^2) complexity, because in the partition step you split the array into two parts of size 1 and n-1, so you get n partition levels instead of the log n you get in the average case.
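To see this degeneration in numbers, here is a small sketch of mine (not code from the answer) that counts the partition levels of a first-element-pivot quicksort; it uses an explicit stack so that the O(n)-deep worst case does not overflow Python's recursion limit:

    import random

    def partition_levels(a):
        # depth of the partition tree when the first element is always the pivot
        depth = 0
        stack = [(0, len(a) - 1, 1)]
        while stack:
            lo, hi, level = stack.pop()
            if lo >= hi:
                continue
            depth = max(depth, level)
            pivot, i = a[lo], lo
            for j in range(lo + 1, hi + 1):  # Lomuto partition around a[lo]
                if a[j] < pivot:
                    i += 1
                    a[i], a[j] = a[j], a[i]
            a[lo], a[i] = a[i], a[lo]
            stack.append((lo, i - 1, level + 1))
            stack.append((i + 1, hi, level + 1))
        return depth

    print(partition_levels(list(range(1000))))  # sorted input: 999 levels, i.e. ~n
    data = list(range(1000))
    random.shuffle(data)
    print(partition_levels(data))               # shuffled input: roughly 2 log2(1000), i.e. ~20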

This is why people decided to add some randomization to make problematic input as unlikely as possible. There are three well-known approaches (all three are sketched below):



  • reshuffle the input: Robert Sedgewick is quoted as saying that "the probability of getting O(n^2) performance with this approach is lower than the probability of being struck by lightning" :)

  • randomly select the pivot element: according to Wikipedia, the expected number of comparisons in this case is 1.386 n log n on average

  • select the pivot as the median of three: according to Wikipedia, the expected number of comparisons in this case is 1.188 n log n on average
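As an illustration (the helper names are mine, not from the answer), all three approaches can be plugged into one quicksort as interchangeable pivot selectors:

    import random

    def first_element(a, lo, hi):
        # approach 1: shuffle the whole array once before sorting,
        # after which the first element is already a random pivot
        return lo

    def random_pivot(a, lo, hi):
        # approach 2: one random number per partition (~1.386 n log n comparisons)
        return random.randint(lo, hi)

    def median_of_three(a, lo, hi):
        # approach 3: median of first, middle and last (~1.188 n log n comparisons)
        mid = (lo + hi) // 2
        return sorted([lo, mid, hi], key=lambda k: a[k])[1]

    def quicksort(a, choose_pivot, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        p = choose_pivot(a, lo, hi)
        a[lo], a[p] = a[p], a[lo]        # move the chosen pivot to the front
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):  # Lomuto partition around a[lo]
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]
        quicksort(a, choose_pivot, lo, i - 1)
        quicksort(a, choose_pivot, i + 1, hi)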

However, randomization comes at a cost. Shuffling the input array takes O(n) time, which is dominated by the O(n log n) of the sort itself, but you also need to take into account the cost of calling random(..) n times. With your simple approach this cost is avoided entirely, which is why it is faster.
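One way to measure that difference is a quick timing harness (hypothetical, reusing the quicksort and pivot selectors from the sketch above); on random input both variants run in O(n log n), so the gap you see is just the per-partition cost of the random(..) calls:

    import random, sys, timeit

    sys.setrecursionlimit(10000)  # headroom for the recursive sort
    data = [random.random() for _ in range(100000)]

    t_first = timeit.timeit(lambda: quicksort(data.copy(), first_element), number=3)
    t_rand = timeit.timeit(lambda: quicksort(data.copy(), random_pivot), number=3)
    print("first element: %.2fs, random pivot: %.2fs" % (t_first, t_rand))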


+1








