Efficient algorithm for partial sorting into N unsorted groups

Question

Efficient algorithm for partial sorting into N unsorted groups

I'm looking for an efficient algorithm to do the following: given an array of N elements, sort it so that the elements are found as M equal groups, where each group was unsorted, but the groups are sorted among themselves (all elements in one group are smaller than any element in the next group) ...

The easiest way is to just sort the entire array. But this is inefficient, especially if the number of groups is much less than the total number of items (for example, sorting a million items into 5 groups).

Currently, I decided to use the quickselect algorithm (specifically the Floyd-Rivest variant ), which sorts the array into 2 unsorted groups and then applies it M-1 times. A big improvement might be to use a divide and conquer approach for quick selection, first sorting the array into two groups, then sorting each half, and so on, until we have M groups. A kind of generalization of the unordered partial sort problem .

But I have a feeling that there might be a simpler and more efficient approach to the problem. Is there something I am missing? Any ideas? I need this to improve insert performance in my RBush JavaScript library (efficient R * -tree implementation).

+3

sorting algorithm selection-sort partitioning quickselect

Mourner 31 Aug 14 at 16:10

source to share

1 answer

btilly · Accepted Answer · 2014-08-31T16:54:47+0000

Here's a repeat of the problem. You need to find the M-1 ranking items at the same time and use them to split the array into M unsorted groups.

I would suggest starting with a standard quickselect with random centers chosen to be close to the median (do a random sub-selection trick to estimate this), for each case dividing the array by 2, and dividing the list of ranked items you need to find and 2 Continue this until you get to a level where you have a ranked item that can be found in your current group. Then switch to the Floyd-Rivest quickselect option to actually find that item and split the current group by 2. Then going backwards you can easily combine the M groups you want.

This approach has an expected runtime O(N log(M))

with pretty good constants. I doubt it is much better than it could possibly be.

Efficient algorithm for partial sorting into N unsorted groups

More articles: